ABSTRACT

We derive constraints on the matter density $\Omega_m$ and the amplitude of matter clustering $\sigma_8$
from measurements of large scale weak lensing (projected separation $R = 5$-$30\,h^{-1}$ Mpc)
by clusters in the Sloan Digital Sky Survey MaxBCG catalog. The weak lensing signal is
proportional to the product of $\Omega_m$ and the cluster-mass correlation function $\xi_{cm}$. With the
relation between optical richness and cluster mass constrained by the observed cluster number
counts, the predicted lensing signal increases with increasing $\Omega_m$ or $\sigma_8$, with mild
additional dependence on the assumed scatter between richness and mass. The dependence of
the signal on scale and richness partly breaks the degeneracies among these parameters. We
incorporate external priors on the richness-mass scatter from comparisons to X-ray data and
on the shape of the matter power spectrum from galaxy clustering, and we test our adopted
model for $\xi_{cm}$ against N-body simulations. Using a Bayesian approach with minimal restrictive
priors, we find $\sigma_8(\Omega_m/0.325)^{0.501} = 0.828 \pm 0.049$, with marginalized constraints of
$\Omega_m = 0.325^{+0.086}_{-0.067}$ and $\sigma_8 = 0.828^{+0.111}_{-0.097}$, consistent with constraints from other MaxBCG
studies that use weak lensing measurements on small scales ($R \le 2\,h^{-1}$ Mpc). The ($\Omega_m$, $\sigma_8$)
constraint is consistent with and orthogonal to the one inferred from WMAP CMB data,
reflecting agreement with the structure growth predicted by General Relativity for a $\Lambda$CDM
cosmological model. A joint constraint assuming $\Lambda$CDM yields $\Omega_m = 0.298^{+0.019}_{-0.020}$ and
$\sigma_8 = 0.831^{+0.020}_{-0.020}$. For these parameters and our best-fit scatter we obtain a tightly constrained
mean richness-mass relation of MaxBCG clusters, $N_{200} = 25.4\,(M/3.61\times10^{14}\,h^{-1}M_\odot)^{0.74}$,
with a normalization uncertainty of 1.5%. Our cosmological parameter errors are dominated
by the statistical uncertainties of the large scale weak lensing measurements, which should
shrink sharply with current and future imaging surveys.

Key words: methods: statistical — cosmology: cosmological parameters — cosmology: 
large-scale structure of Universe 



1 INTRODUCTION 

The most fundamental question about the origin of cosmic acceleration is whether it
arises from a new energy component or from a modification of General Relativity (GR) on
cosmological scales. A general strategy to address this question is to compare the growth
of cosmic structure (as measured, e.g., by cosmic shear, redshift-space distortions of
galaxy clustering, or the abundance of galaxy clusters as a function of mass) to the
predictions of a GR+dark energy model constrained by geometrical probes such as Type Ia
supernovae and baryon acoustic oscillations (BAO). In particular, one can compare
measurements of the matter density $\Omega_m$ and the present-day amplitude of matter
clustering, characterized by $\sigma_8$, the rms matter fluctuation in $8\,h^{-1}$ Mpc
spheres, to the values expected from extrapolating cosmic microwave background (CMB)
anisotropies forward from recombination to $z = 0$. Cosmological studies with clusters traditionally
use mass proxies derived from X-ray or Sunyaev-Zel'dovich (SZ; Sunyaev & Zeldovich 1972)
measurements to constrain the cluster mass function $dn/dM$ (see Allen et al. 2011 for a review).
In an alternative approach, Sheldon et al. (2009; hereafter S09) used stacked weak lensing (WL)
to measure the average mass profiles around clusters in the MaxBCG catalog (Koester et al. 2007)
derived from the Sloan Digital Sky Survey (SDSS; York et al. 2000), detecting correlated mass
from scales of $0.1\,h^{-1}$ Mpc to $30\,h^{-1}$ Mpc. Rozo et al. (2010; hereafter R10) used the
S09 measurements to constrain the mean relation between optical richness and virial mass for
MaxBCG clusters, and they combined this relation with the abundance of clusters as a function
of richness to constrain $\Omega_m$ and $\sigma_8$. (For a general review of this approach in the
context of cluster cosmology, see §6 of Weinberg et al. 2012.) In this paper we again target
$\Omega_m$ and $\sigma_8$ with MaxBCG clusters, but we use the large scale S09 measurements, from
projected separations of 5-30 $h^{-1}$ Mpc.

Roughly speaking, stacked weak lensing measures the product of the matter density $\Omega_m$
and the cluster-mass cross-correlation function $\xi_{cm}(r)$. More precisely, given knowledge
of the distances to lensing clusters and background sources, the mean tangential shear profile
of clusters measures the excess surface density profile $\Delta\Sigma(R)$, which is related to
the 3-d $\xi_{cm}(r)$ via

$$\Delta\Sigma(R) = \Omega_m \rho_c \left[ \frac{2}{R^2} \int_0^R r_p\,dr_p \int_{-\infty}^{+\infty} \xi_{cm}\!\left(\sqrt{r_p^2 + r_z^2}\right) dr_z - \int_{-\infty}^{+\infty} \xi_{cm}\!\left(\sqrt{R^2 + r_z^2}\right) dr_z \right] \qquad (1)$$

(see §2 for further details). We can understand how large scale
$\Delta\Sigma(R)$ measurements constrain $\Omega_m$ and $\sigma_8$ by considering the
simple case in which optical richness is perfectly correlated with cluster mass, so that a
sample of clusters above a richness threshold corresponds to a sample above a mass threshold
that has the same comoving space density $\bar{n}$ (where, for simplicity, we consider a sample
at fixed redshift). For a given cosmological model, one can predict the matter correlation
function $\xi_{mm}(r)$ and the bias factor $b_c(\bar{n})$ of halos with space density $\bar{n}$,
and thus the cluster-mass correlation function, which is $\xi_{cm}(r) = b_c(\bar{n})\,\xi_{mm}(r)$
on scales in the linear regime. Raising $\Omega_m$ with all other quantities held fixed raises
the predicted $\Delta\Sigma(R)$ proportionally. Raising $\sigma_8$ increases
$\xi_{mm} \propto \sigma_8^2$ and thus increases $\xi_{cm}(r)$, but there is a partly
compensating decline in $b_c(\bar{n})$. In the limit of very rare, very highly biased peaks,
$b_c(\bar{n}) \propto \sigma_8^{-1}$, yielding $\Delta\Sigma(R) \propto \Omega_m \sigma_8$, but for
the space densities of typical cluster samples $b_c(\bar{n})$ drops more slowly than
$\sigma_8^{-1}$. Thus, the combination of cluster abundance measurements, which determine
$\bar{n}$, and large scale weak lensing measurements, which determine $\Delta\Sigma(R)$,
constrains a parameter combination $\sigma_8 \Omega_m^{\gamma}$ with $\gamma < 1$. In practice, we
will use bins of cluster richness instead of a single sample above a threshold, and the
$\sigma_8$-dependence of $\xi_{cm}(r)$ is different in the linear and mildly non-linear regimes,
so there is some leverage to break the degeneracy between $\Omega_m$ and $\sigma_8$.

The simplifications of this description point up several complications that must be
addressed in our analysis. First, the measurements of $\Delta\Sigma(R)$ have systematic
uncertainties related to the photometric redshifts of the sources and shear calibration. Second,
optical richness is a mass indicator with substantial scatter, which makes the bias of clusters
in richness bins different from that of mass bins with the same space density. The two principal
"nuisance parameters" in our statistical analysis are $\beta$, an overall scaling of the
$\Delta\Sigma(R)$ measurements to allow for systematic uncertainty, and
$\sigma_{\ln N_{200}|M}$, the logarithmic scatter in richness at fixed mass. We discuss these
nuisance parameters and the priors we adopt on them in §3. We also adopt a prior on the shape of
the matter power spectrum, so that a value of $\sigma_8$ specifies the full shape of
$\xi_{mm}(r)$. The inference of comoving space densities itself depends on $\Omega_m$, which
affects the volume element transformation between comoving distances and observable angles and
redshifts. Incompleteness and contamination of the cluster sample can also affect the inferred
space densities and/or bias the estimate of $\Delta\Sigma(R)$, so they must also be accounted
for in the analysis. Despite these complications, we find that our constraints are limited by
the statistical errors of the weak lensing measurements rather than systematic uncertainties.

1 We define $h = H_0/(100\ {\rm km\,s^{-1}\,Mpc^{-1}})$, where $H_0$ is the Hubble parameter
at $z = 0$.

A complete cosmological analysis of cluster weak lensing would employ $\Delta\Sigma(R)$
measurements over the full range of observed scales. Here we restrict our analysis to
$R \ge 5\,h^{-1}$ Mpc, in part to avoid the regime where theoretical predictions of
$\xi_{cm}(r)$ are uncertain, and in part to keep our results complementary to those of R10, who
use the small scale ($R < 2\,h^{-1}$ Mpc) S09 measurements to calibrate their determination of
the cluster mass function. One important systematic for interpretation of the small scale
measurements is the impact of cluster mis-centering, which must be estimated from simulations of
the cluster population and cluster finding technique (e.g., Johnston et al. 2007; George et al.
2012). One advantage of the approach in this paper is that mis-centering has negligible impact
at the large scales that we employ.

In the following section we briefly review our input data, the MaxBCG cluster catalog and
the S09 weak lensing measurements. Section 3 presents our analysis method in detail, including
the model parameters and priors and the procedure for computing the likelihood of the data given
these parameters. Section 4 tests our analytic model for $\Delta\Sigma(R)$ (a modified version
of that proposed by Hayashi & White 2008, hereafter HW08) against numerical simulations, and it
uses simple mock data sets to test other aspects of our analysis procedures. Section 5 presents
our cosmological constraints and compares them to those from other cluster analyses and from CMB
data. We address systematic uncertainties in §6. We close, in §7, with a summary of our findings
and a discussion of future prospects. The reader in a hurry can get an overview of the paper
from Fig. 1, which compares our best-fit model to our input data, Fig. 7, which shows how the
$\Delta\Sigma(R)$ prediction depends on model parameters, and Fig. 9, which presents our derived
constraints on $\Omega_m$ and $\sigma_8$.



2 DATA 

2.1 Cluster Catalog and Number Counts 

The MaxBCG cluster catalog (Koester et al. 2007) consists of 13,823 clusters identified
from the imaging data of the SDSS Data Release 4 (DR4; Adelman-McCarthy et al. 2006). The
clusters are selected as spatial overdensities of red galaxies, which form a tight E/S0
ridgeline in the color-magnitude diagram. Each cluster is assigned a richness measure $N_{200}$,
defined as the number of red-sequence galaxies with $L > 0.4L_*$ in the $i$-band within a scaled
radius $R_{200}$ such that the galaxy density interior to that radius is 200 times the mean
galaxy density. The tight relation between the ridgeline color and redshift also allows an
accurate photometric
Figure 1. Comparison between observables used in the analysis and the best-fit model predictions. Left panel: Cluster number counts from the MaxBCG
sample (dark/red histograms) and the best-fit model prediction (light histograms with errorbars). Middle panel: Stacked surface density contrast profiles of
five richness bins measured by S09 (solid circles with errorbars) and predicted by the best-fit model (solid curves), multiplied by a different constant for each
bin to avoid crowding. Errorbars in the left and middle panels correspond to the square root of the diagonal terms in the covariance matrix. Right panel: Mean
richness-mass relation (thick solid line) and intrinsic scatter (gray shaded stripe) of the best-fit model. Solid and dashed horizontal lines bracket the richness
ranges associated with the cluster samples we used for the number counts and $\Delta\Sigma(R)$, respectively.



redshift estimate for each cluster ($\Delta z \sim 0.01$). Tests against mock catalogs suggest
MaxBCG is $\sim 90\%$ complete and pure for clusters with masses $\ge 10^{14}\,h^{-1}M_\odot$,
and 95% for higher mass clusters (Rozo et al. 2007a). The MaxBCG catalog is nearly
volume-limited in the redshift range between 0.1 and 0.3 over 7398 deg$^2$. The large volume and
dynamic range in mass make it well suited to our cosmological analysis.

For better control on purity and completeness, we only keep 10,815 MaxBCG clusters (78% of
total) with $N_{200} \ge 11$ for the abundance measurement. Since there are only 5 clusters with
$N_{200} > 120$ but they span a large richness range up to $N_{200}^{\rm max} = 188$, we need to
model the number counts for clusters with $N_{200}$ below and above 120 separately. We describe
the difference in their treatment in our likelihood model in §3.2.

For the clusters with $N_{200} \in \{11, \ldots, 120\}$, Table 1 gives our richness binning,
and the red histograms in the left panel of Fig. 1 (discussed further below) show the measured
number counts in those bins. When predicting these numbers for a set of model parameters, we
integrate over redshift and account for scatter between photometric and true redshift, with a
survey area of 7398 deg$^2$. To a good approximation, the predicted counts are what one would
obtain using the halo mass function at $z = 0.23$ and the comoving volume from $z = 0.1$ to
$z = 0.3$.
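The comoving volume entering this approximation can be sketched numerically. The block below is a minimal illustration rather than the authors' code: it assumes a flat $\Lambda$CDM background, takes $\Omega_m = 0.28$ purely as an example value, and uses hypothetical function names (`comoving_distance`, `survey_volume`). Distances are in $h^{-1}$ Mpc, so $h$ drops out.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance(z, omega_m, n_steps=2000):
    """Comoving distance in h^-1 Mpc for flat LCDM (H0 = 100 h km/s/Mpc)."""
    dz = z / n_steps
    total = 0.0
    for i in range(n_steps):
        zm = (i + 0.5) * dz  # midpoint rule
        e_z = math.sqrt(omega_m * (1.0 + zm) ** 3 + 1.0 - omega_m)
        total += dz / e_z
    return C_KM_S / 100.0 * total


def survey_volume(z_min, z_max, area_deg2, omega_m):
    """Comoving volume in (h^-1 Mpc)^3 of a shell segment covering area_deg2."""
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2
    f_sky = area_deg2 / full_sky_deg2
    d_min = comoving_distance(z_min, omega_m)
    d_max = comoving_distance(z_max, omega_m)
    return f_sky * 4.0 * math.pi / 3.0 * (d_max ** 3 - d_min ** 3)


# Survey limits and area from the text; omega_m = 0.28 is an illustrative choice.
volume = survey_volume(0.1, 0.3, 7398.0, omega_m=0.28)
print(f"comoving volume ~ {volume:.3e} (h^-1 Mpc)^3")
```

Because the volume shrinks as $\Omega_m$ grows, the same observed counts imply a higher space density in a higher-$\Omega_m$ model, which is the dependence noted in the introduction.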



2.2 Large Scale Cluster Weak Lensing Measurements 

We take our large scale weak lensing measurements from S09, who measured the mean
tangential shear profiles $\gamma_T(R)$ of source galaxies around lens clusters in bins of
richness. The area of the imaging data used for this analysis is somewhat smaller than the
7398 deg$^2$ used for the number counts. The mean tangential shear is then converted to the mean
excess surface density profile $\Delta\Sigma(R)$ for each richness bin,

$$\Delta\Sigma(R) = \Sigma(<R) - \Sigma(R) = \gamma_T(R) \times \Sigma_{\rm crit}, \qquad (2)$$

where $\Sigma(R)$ is the azimuthally averaged surface density at projected radius $R$ and
$\Sigma(<R)$ is the mean surface density interior to $R$ (see, e.g., Miralda-Escudé 1991;
Sheldon et al. 2004). The critical surface density $\Sigma_{\rm crit}$ above is defined to be

$$\Sigma_{\rm crit} = \frac{c^2}{4\pi G}\,\frac{D_S}{D_{LS}\,D_L}, \qquad (3)$$

where $D_S$, $D_L$, and $D_{LS}$ are the angular diameter distances from the observer to the
source, from the observer to the lens, and between the lens and source, respectively. To
calculate $\Sigma_{\rm crit}$, S09 estimated $D_S$ and $D_{LS}$ using the photo-z's of source
galaxies, so any uncertainties in the photo-z estimates affect the measurements of
$\Delta\Sigma(R)$, as we will describe in §3.1. The values of $\Sigma_{\rm crit}$ are computed
for a spatially flat universe with $\Omega_m = 0.28$ and a cosmological constant. Over our
redshift range, the impact of varying this assumption is negligible compared to the statistical
errors, so we do not adjust $\Sigma_{\rm crit}$ when fitting cosmological parameters.
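As a rough numerical illustration of equation (3), the sketch below evaluates $\Sigma_{\rm crit}$ for a single lens-source pair in the flat $\Omega_m = 0.28$ cosmology quoted above. The lens redshift 0.23 and source redshift 0.5 are illustrative choices, not S09's source distribution, and the function names are hypothetical. With distances in $h^{-1}$ Mpc the result is in $h\,M_\odot\,{\rm Mpc}^{-2}$ (physical units).

```python
import math

C_KM_S = 299792.458          # speed of light, km/s
G_MPC = 4.30091e-9           # Newton's constant, Mpc (km/s)^2 / Msun


def comoving_distance(z, omega_m, n_steps=2000):
    """Comoving distance in h^-1 Mpc for flat LCDM."""
    dz = z / n_steps
    total = 0.0
    for i in range(n_steps):
        zm = (i + 0.5) * dz
        total += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + 1.0 - omega_m)
    return C_KM_S / 100.0 * total


def sigma_crit(z_lens, z_source, omega_m):
    """Critical surface density, eq. (3), in h Msun / Mpc^2."""
    dc_l = comoving_distance(z_lens, omega_m)
    dc_s = comoving_distance(z_source, omega_m)
    d_l = dc_l / (1.0 + z_lens)                 # angular diameter distances
    d_s = dc_s / (1.0 + z_source)
    d_ls = (dc_s - dc_l) / (1.0 + z_source)     # valid for a flat universe
    return C_KM_S ** 2 / (4.0 * math.pi * G_MPC) * d_s / (d_ls * d_l)


example = sigma_crit(0.23, 0.5, omega_m=0.28)
print(f"Sigma_crit ~ {example:.3e} h Msun / Mpc^2")
```

Re-running with a different `omega_m` shows how weakly the distance ratio depends on cosmology over this redshift range, in line with the statement that $\Sigma_{\rm crit}$ need not be adjusted during the fit.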

While the signal-to-noise ratio of $\Delta\Sigma(R)$ for each cluster is small, S09
stacked the signals among all clusters in each richness bin to obtain average $\Delta\Sigma(R)$
profiles. The stacked signal was detected from the inner halo ($25\,h^{-1}$ kpc) well into the
surrounding large scale structure ($30\,h^{-1}$ Mpc). As mentioned in the introduction, the
small scale measurements were used by R10 for their constraints, and we hope to employ the large
scales by interpreting $\Delta\Sigma(R)$ as a measure of $\Omega_m \xi_{cm}$.

Table 2 summarizes the richness binning for $\Delta\Sigma(R)$, and the solid circles in
the middle panel of Fig. 1 show the measured $\Delta\Sigma(R)$ on large scales. Errorbars are
derived from jackknife resampling (see §3.2 below) and are correlated between points. We take
our data values for $\Delta\Sigma(R)$ from table 1 of S09 with one important correction. As
first noted by Mandelbaum et al. (2008), the weak lensing signal in S09 appears to be diluted
because of photometric redshift errors that incorrectly place some foreground galaxies (which
cannot be lensed by the clusters) behind the clusters. Rozo et al. (2009) estimated that the
original S09 measurements of $\Delta\Sigma(R)$ should be multiplied by a factor of 1.18, with an
uncertainty of 0.04. We adopt this factor of 1.18 to scale up the original S09 $\Delta\Sigma(R)$
and its associated error matrix, and we use the scaled values as the actual data for the
analysis. We refer to these scaled data as the "S09 measurements" in the rest of the paper.
Table 1. Richness bins for the abundance data. The average mass and bias
of clusters in each bin are computed from the best-fit MaxBCG+WMAP7
joint model.

Richness   No. of Clusters $N_i$   $\langle M_{200m}\rangle_i$ [$h^{-1}M_\odot$]   $\langle b\rangle_i$
11-14      5167                    $0.997\times10^{14}$                            2.373
15-18      2387                    $1.404\times10^{14}$                            2.696
19-23      1504                    $1.857\times10^{14}$                            3.010
24-29      765                     $2.417\times10^{14}$                            3.359
30-38      533                     $3.153\times10^{14}$                            3.772
39-48      230                     $4.108\times10^{14}$                            4.260
49-61      134                     $5.200\times10^{14}$                            4.769
62-78      59                      $6.556\times10^{14}$                            5.353
79-120     31                      $8.627\times10^{14}$                            6.170




3 ANALYSIS 

To obtain constraints on $\Omega_m$ and $\sigma_8$, we adopt a Bayesian approach with a
minimal set of restrictive priors on nuisance parameters. To facilitate the analysis, we also
incorporate information from other experiments as priors, specifically the shape of the linear
power spectrum and the scatter of cluster masses at given richness. Details on the model
parameters, likelihood components, and the prior specifications can be found below.

3.1 Model Parameters 

We assume a flat $\Lambda$CDM cosmology and infer the values of $\Omega_m$ and $\sigma_8$,
along with other nuisance parameters. Any deviation from the flat $\Lambda$CDM+GR assumption
would manifest itself as inconsistent constraints on $\Omega_m$ and $\sigma_8$ compared to
expectations from the CMB, because of growth that differs from predictions of the
GR+cosmological constant model. Since we introduce the shape of the linear power spectrum
$P_{\rm lin}(k)$ inferred from galaxy redshift surveys as a prior, we do not assume specific
values for the tilt of the primordial power spectrum $n_s$, the baryon density $\Omega_b h^2$,
and the neutrino mass $\Omega_\nu h^2$, but only require them to be consistent with the input
power spectrum shape and the output constraints on $\Omega_m$ and $\sigma_8$. Our analysis is
also independent of the assumed value of the Hubble parameter $h$, as both the power spectrum
shape and the weak lensing shear are measured from galaxies and clusters in the local universe,
so that all distances are in units of $h^{-1}$ Mpc. While the $P(k)$ shape is determined very
well on the scales that are relevant to galaxies and clusters (Reid et al. 2010, hereafter
Reid10), it could deviate on other scales. We therefore allow rotational freedom in the $P(k)$
shape by introducing a modification to the overall tilt as another parameter $\Delta n_s$,
representing residual uncertainty in the $P(k)$ shape. The final linear power spectrum
$P_{\rm lin}(k)$ is then $\propto P_{\rm Reid10}(k)\,k^{\Delta n_s}$, normalized accordingly by
the input $\sigma_8$. We comment more on the $P(k)$ shape in §4.3.
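The tilt-and-normalize step can be illustrated with a short sketch. It is not the authors' implementation: `toy_shape` is a hypothetical stand-in for the Reid10 power spectrum shape (which is not reproduced here), and the integration limits and step count are arbitrary choices. The normalization uses the standard top-hat definition of $\sigma_8$.

```python
import math


def tophat_window(x):
    """Fourier transform of a real-space spherical top hat."""
    if x < 1e-4:
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3


def sigma_r(power, radius, k_min=1e-4, k_max=50.0, n_steps=4000):
    """rms fluctuation in spheres of `radius` (h^-1 Mpc) for P(k) = power(k)."""
    # sigma^2 = (1 / 2 pi^2) * integral dlnk k^3 P(k) W^2(kR), midpoint rule in ln k
    dlnk = math.log(k_max / k_min) / n_steps
    total = 0.0
    for i in range(n_steps):
        k = k_min * math.exp((i + 0.5) * dlnk)
        total += k ** 3 * power(k) * tophat_window(k * radius) ** 2 * dlnk
    return math.sqrt(total / (2.0 * math.pi ** 2))


def toy_shape(k):
    """Hypothetical P(k) shape with arbitrary amplitude (NOT the Reid10 shape)."""
    return k / (1.0 + (k / 0.02) ** 2) ** 2


def tilted_normalized(shape, delta_ns, sigma8):
    """Return P_lin(k) proportional to shape(k) * k**delta_ns, normalized to sigma8."""
    def tilted(k):
        return shape(k) * k ** delta_ns
    amplitude = (sigma8 / sigma_r(tilted, 8.0)) ** 2

    def p_lin(k):
        return amplitude * tilted(k)
    return p_lin


p_lin = tilted_normalized(toy_shape, delta_ns=0.05, sigma8=0.8)
print(sigma_r(p_lin, 8.0))
```

Since the amplitude is fixed afterwards by $\sigma_8$, the pivot scale of the factor $k^{\Delta n_s}$ does not matter; only the relative change in slope survives.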

Following R10, we assume that the mean cluster richness-mass relation is a power-law,
parameterized by two mean log-richnesses, $\ln N_1$ and $\ln N_2$, at
$M_1 = 1.3\times10^{14}\,h^{-1}M_\odot$ and $M_2 = 1.3\times10^{15}\,h^{-1}M_\odot$, respectively.
We investigate the effect of allowing deviation from a power-law in §6. To go from the expected
mean richness of a cluster of mass $M$ to the actual observed richness, we assume a log-normal
distribution with a constant scatter $\sigma_{\ln N_{200}|M}$ across all cluster masses. Note
that, unlike R10, we use the mass unit $h^{-1}M_\odot$ rather than $M_\odot$ throughout the
analysis, in accordance with our choice of distance units to avoid dependence on $h$. The right
panel of Fig. 1 shows the richness-



Table 2. Richness bins for the stacked $\Delta\Sigma(R)$ measurements. The average
mass and bias of clusters in each bin are computed from the best-fit
MaxBCG+WMAP7 joint model.

Richness   No. of Clusters $N_j$   $\langle M_{200m}\rangle_j$ [$h^{-1}M_\odot$]   $\langle b\rangle_j$
12-17      5651                    $1.166\times10^{14}$                            2.511
18-25      2269                    $1.860\times10^{14}$                            3.010
26-40      1021                    $2.918\times10^{14}$                            3.641
41-70      353                     $4.822\times10^{14}$                            4.591
71+        55                      $8.459\times10^{14}$                            6.093


mass relation predicted by our best-fit model, with $\ln N_1 = 2.446$, $\ln N_2 = 4.148$, and
scatter $\sigma_{\ln N_{200}|M} = 0.432$ (gray shaded band). The solid and dashed vertical lines
indicate the richness ranges of clusters used in the number count and weak lensing constraints,
respectively.
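For concreteness, the two-pivot parameterization can be turned into a slope and evaluated at any mass. The sketch below uses only the pivot masses and best-fit log-richnesses quoted in the text; the function name is a hypothetical label.

```python
import math

M1, M2 = 1.3e14, 1.3e15        # pivot masses in h^-1 Msun (from the text)
LN_N1, LN_N2 = 2.446, 4.148    # best-fit mean log-richness at the two pivots


def mean_ln_richness(mass, ln_n1=LN_N1, ln_n2=LN_N2):
    """<ln N200 | M> for a power law passing through the two pivot points."""
    slope = (ln_n2 - ln_n1) / math.log(M2 / M1)
    return ln_n1 + slope * math.log(mass / M1)


slope = (LN_N2 - LN_N1) / math.log(M2 / M1)
print(f"power-law slope = {slope:.3f}")
```

The slope implied by these two numbers is close to the exponent of 0.74 quoted in the abstract for the joint fit.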

Note that we refer to $\langle \ln N_{200}|M\rangle$ as the mean "richness-mass" relation,
as distinct from the mean "mass-richness" relation $\langle \ln M|N_{200}\rangle$. In the
presence of scatter, one cannot trivially convert from one to the other, since there are more
low mass halos to scatter to high richness than vice versa. Furthermore, since we assume
log-normal rather than Gaussian scatter in richness, the mean richness at a fixed mass,
$\langle N_{200}|M\rangle$, is not simply $\exp\langle \ln N_{200}|M\rangle$. Rozo et al. (2012a)
discussed these issues in the more general context of cluster observables. Finally, we caution
that when integrating over a bin in richness, the mean $\ln M$ is not simply the value of
$\langle \ln M|N_{200}\rangle$ evaluated at the bin center (e.g., compare centers of
distributions of the same color in the top and bottom panels of Fig. 2, which we will discuss
more later). These effects account for the difference between the mean mass-richness relation
quoted by Rozo et al. (2009), which is directly the inferred mean mass of clusters in specified
bins of richness, and the richness-mass relation of R10, which is a central power-law derived
from a full cosmological fit and more closely analogous to what we do here.
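The log-normal point can be checked directly. For a log-normal variable with mean log $\mu$ and scatter $\sigma$, the arithmetic mean is $\exp(\mu + \sigma^2/2)$, a standard result. The sketch below compares that expression with a Monte Carlo average, using the best-fit scatter 0.432 and an illustrative $\mu = 3.0$ (not a value from the paper).

```python
import math
import random


def mean_richness(mean_ln_n, scatter):
    """<N | M> for log-normal scatter: exp(mu + sigma^2 / 2)."""
    return math.exp(mean_ln_n + 0.5 * scatter ** 2)


mu, sigma = 3.0, 0.432            # mu is illustrative; sigma is the best-fit scatter
rng = random.Random(1)            # fixed seed for reproducibility
draws = [math.exp(rng.gauss(mu, sigma)) for _ in range(100000)]
monte_carlo_mean = sum(draws) / len(draws)

print(f"exp(<ln N>)   = {math.exp(mu):.2f}")
print(f"analytic <N>  = {mean_richness(mu, sigma):.2f}")
print(f"sampled  <N>  = {monte_carlo_mean:.2f}")
```

For this scatter the mean richness exceeds $\exp\langle \ln N_{200}|M\rangle$ by a factor $\exp(0.432^2/2)$, roughly ten percent.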

To account for residual uncertainties in the photometric redshift distribution used in the
weak lensing analysis, we divide the predicted $\Delta\Sigma(R)$ by a nuisance parameter
$\beta$ before comparing with the data, so that we are effectively modeling the underlying
tangential shear profiles while using $\beta$ to characterize the multiplicative bias in the
conversion to $\Delta\Sigma(R)$. We adopt a Gaussian prior with central value $\beta = 1.0$ and
width $\delta\beta = 0.06$, somewhat larger than the uncertainty of 0.04 estimated by Rozo et al.
(2009). We comment more on the constraints on $\beta$ in §5.4 and on the prior in §3.

We thus have seven parameters in the model that we fit to the cluster abundance and large
scale $\Delta\Sigma(R)$ data: two cosmological parameters ($\Omega_m$, $\sigma_8$) that we are
hoping to constrain, and five nuisance parameters, namely
($\ln N_1$, $\ln N_2$, $\sigma_{\ln N_{200}|M}$) of the richness-mass relation, $\beta$ as the
residual bias of the weak lensing shape measurement (hereafter referred to as the "weak lensing
bias"), and $\Delta n_s$ for the modulation of the $P(k)$ shape.

3.2 Likelihood 

We model the number counts and weak lensing measurements for clusters in different bins of
richness. Aiming for better statistical rather than extra tomographic constraints, we do not
divide our sample into multiple redshift bins, but only retain the whole photometric redshift
range as a single bin ($0.1 < z_{\rm photo} < 0.3$). The observable vector in our likelihood
model thus has three components:

1. $N_i$: number of clusters in each richness bin $i$, for $i \in \{1, \ldots, 9\}$.

2. $\Delta\Sigma_j(R_k)$: stacked $\Delta\Sigma$ profile of richness bin $j$ measured at
radius $R_k$, for $j \in \{1, \ldots, 5\}$ and $k \in \{1, \ldots, 6\}$.

3. Number of clusters with $N_{200} > 120$.

We model the combined vector of the 1st and the 2nd components as a multivariate Gaussian
(39 variables in total), which is fully specified by its mean vector and covariance matrix.
Fig. 1 illustrates the observations of $N_i$ (gray/red histograms in the left panel) and
$\Delta\Sigma_j(R_k)$ (solid circles in the middle panel) used in our analysis. The richness
bins we employed for cluster abundance and weak lensing measurements are listed in Tables 1
and 2, respectively. We adopt the same richness bins that R10 used for number counts $N_i$ and
weak lensing profiles $\Delta\Sigma_j$; note that the $i$ and $j$ bins are overlapping but not
identical, as larger bins are required to achieve reasonable S/N in $\Delta\Sigma(R)$. We only
use the $\Delta\Sigma$ measurements at large scales ($R_k > 5\,h^{-1}$ Mpc). We comment on the
choice of cutoff radius at $R_{\rm min} = 5\,h^{-1}$ Mpc in §4.2.

For $N_{200} > 120$, the assumption of Gaussian fluctuations in number counts is invalid
due to the rarity of extreme clusters. Following R10, we model the count of $N_{200} > 120$
clusters with a Poisson binomial distribution, i.e., the distribution of a sum of independent
Bernoulli variables, one at each integer $N_{200} > 120$. The likelihood associated with this
tail population of clusters, $\mathcal{L}_{\rm tail}$, is given in equation 3 of Rozo et al.
(2010).

The final likelihood $\mathcal{L}$ is then simply the product of the Gaussian likelihood
and the Poisson binomial likelihood.
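The structure of this likelihood can be sketched as the sum of two log-terms. The block below is an illustration under stated assumptions, not the authors' code and not the specific expression of R10's equation 3: the Gaussian part is a generic multivariate normal evaluated by Cholesky decomposition, and the tail part is a generic Poisson binomial probability for hypothetical per-richness occupation probabilities `tail_probs`.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def gaussian_loglike(data, mean, cov):
    """Log of a multivariate Gaussian density with the given mean and covariance."""
    n = len(data)
    lower = cholesky(cov)
    resid = [d - m for d, m in zip(data, mean)]
    y = []  # forward substitution: solve L y = resid, so chi^2 = y . y
    for i in range(n):
        s = sum(lower[i][k] * y[k] for k in range(i))
        y.append((resid[i] - s) / lower[i][i])
    chi2 = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    return -0.5 * (chi2 + log_det + n * math.log(2.0 * math.pi))


def poisson_binomial_pmf(probs):
    """pmf[k] = P(exactly k successes) among independent Bernoulli trials."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, v in enumerate(pmf):
            nxt[k] += v * (1.0 - p)
            nxt[k + 1] += v * p
        pmf = nxt
    return pmf


def total_loglike(data, mean, cov, tail_probs, n_tail_observed):
    """ln L = ln L_Gaussian + ln L_tail (product of the two likelihoods)."""
    tail = poisson_binomial_pmf(tail_probs)[n_tail_observed]
    return gaussian_loglike(data, mean, cov) + math.log(tail)


# Toy 2-variable example; all numbers are illustrative, not data from the paper.
example = total_loglike([1.0, 0.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]],
                        tail_probs=[0.5, 0.5], n_tail_observed=1)
print(example)
```

In the actual analysis the Gaussian part would carry the 39-element data vector and its model covariance, with the mean vector recomputed at every point in parameter space.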



3.2.1 Expectation Values 

For any given cosmology, the mass function $dn/dM$ and the $\Delta\Sigma(R)$ of dark
matter halos can be theoretically predicted as functions of mass $M$ at each redshift $z$. To
convert to the total number $N$ and the average $\Delta\Sigma(R)$ of clusters with richness
$N_{200}$ and photometric redshift $z_{\rm photo}$, we need to convolve with a kernel that
relates the observables ($N_{200}$ and $z_{\rm photo}$) to the intrinsic properties ($M$ and
$z$). For our purpose, the kernel function $\omega_l(M, z)$ is defined as the expected
differential number of clusters with mass $M$ at redshift $z$ that fall into richness bin $l$
and within our photometric redshift range,

$$\omega_l(M, z) = \frac{\langle dN_l | M, z\rangle}{dM\,dz}, \qquad l \in \{i, j\} \qquad (4)$$

(we use $i$ for number counts, $j$ for $\Delta\Sigma$ bins, and $l$ as the generic index). The
derivation of $\omega_l(M, z)$ is similar to that in Rozo et al. (2007b, 2010), and we briefly
describe the procedure below.

We start by defining the richness selection function $\psi_l(N_{200})$, which is the
probability for a cluster of $N_{200}$ to be selected into the $l$-th richness bin. For the
richness bins defined by Tables 1 and 2, $\psi_l(N_{200})$ is simply a top-hat bracketed by the
two ends of the $l$-th richness bin. For our calculation, however, we need
$\langle\psi_l|M\rangle$, the expected probability for a cluster of true mass $M$ to be selected
into the $l$-th richness bin. Without loss of generality, we drop the subscript $l$ of $\psi$ so
that

$$\langle\psi|M\rangle = \int dN_{200}\, P(N_{200}|M)\, \psi(N_{200}), \qquad (5)$$

where $P(N_{200}|M)$ is the probability of a halo of mass $M$ being observed with richness
$N_{200}$, specified by the richness-mass relation. The bottom panel of Fig. 2 shows
$\langle\psi|M\rangle$ for the five richness bins used for the $\Delta\Sigma(R)$ measurements,
given our best-fit model. To a first approximation, these distributions are Gaussian in
$\ln M$, centered on the value of $M$ that corresponds to the mean mass-richness relation at the
central richness of the bin.

2 Note that R10 have a typo in their equation 3, where the order of the two rhs terms
should be reversed.
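For a top-hat $\psi$ and log-normal $P(N_{200}|M)$, equation (5) reduces to a difference of two normal cumulative distribution functions. The sketch below evaluates it for the Table 2 bins at $M = 10^{15}\,h^{-1}M_\odot$, using the best-fit richness-mass numbers quoted in §3.1. Treating richness as a continuous variable with edges at the lower bin boundaries is a simplifying assumption of this illustration, and the function names are hypothetical.

```python
import math

M1, M2 = 1.3e14, 1.3e15          # pivot masses, h^-1 Msun
LN_N1, LN_N2 = 2.446, 4.148      # best-fit mean log-richness at the pivots
SCATTER = 0.432                  # best-fit log-normal scatter


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_ln_richness(mass):
    slope = (LN_N2 - LN_N1) / math.log(M2 / M1)
    return LN_N1 + slope * math.log(mass / M1)


def bin_probability(mass, n_low, n_high, scatter=SCATTER):
    """<psi|M>: chance that a halo of this mass lands in [n_low, n_high)."""
    mu = mean_ln_richness(mass)
    upper = normal_cdf((math.log(n_high) - mu) / scatter)
    lower = normal_cdf((math.log(n_low) - mu) / scatter)
    return upper - lower


# Assumed continuous edges for the Table 2 bins (12-17, 18-25, 26-40, 41-70, 71+).
edges = [12.0, 18.0, 26.0, 41.0, 71.0, float("inf")]
probabilities = [bin_probability(1.0e15, lo, hi)
                 for lo, hi in zip(edges[:-1], edges[1:])]
for lo, p in zip(edges[:-1], probabilities):
    print(f"bin starting at N200 = {lo:.0f}: <psi|M> = {p:.3f}")
```

The five probabilities sum to the chance that the halo has $N_{200} \ge 12$, which corresponds to the dotted curve described in the caption of Fig. 2.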

We also consider the photometric redshift selection function $\phi(z_{\rm photo})$, which
is defined as the probability for a cluster of measured $z_{\rm photo}$ to be selected into the
catalog, i.e., a top-hat bracketed by the redshift extent of the catalog. Similarly, we instead
need the expected selection function for a cluster at spectroscopic redshift $z$ to appear in
the catalog,

$$\langle\phi|z\rangle = \int dz_{\rm photo}\, P(z_{\rm photo}|z)\, \phi(z_{\rm photo}), \qquad (6)$$

where $P(z_{\rm photo}|z)$ is the probability for a cluster at true redshift $z$ to be assigned
photometric redshift $z_{\rm photo}$. This probability is assumed to be Gaussian, centered on the
spectroscopic redshift $z$, with $\sigma(z_{\rm photo}|z) = 0.008$ (Koester et al. 2007).
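With a Gaussian $P(z_{\rm photo}|z)$ and a top-hat $\phi$, equation (6) also has a closed form in terms of the normal cumulative distribution. A minimal sketch with the redshift limits and scatter quoted in the text (the function name is a hypothetical label):

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def redshift_selection(z_true, z_min=0.1, z_max=0.3, sigma_z=0.008):
    """<phi|z>: probability that a cluster at z_true gets 0.1 < z_photo < 0.3."""
    upper = normal_cdf((z_max - z_true) / sigma_z)
    lower = normal_cdf((z_min - z_true) / sigma_z)
    return upper - lower


for z in (0.08, 0.10, 0.20, 0.30, 0.32):
    print(f"z = {z:.2f}: <phi|z> = {redshift_selection(z):.4f}")
```

The selection is essentially unity in the interior of the redshift range and falls to one half at each boundary, so photo-z scatter matters only within a few $\sigma_z$ of $z = 0.1$ and $z = 0.3$.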

The product of $\langle\psi|M\rangle$ and $\langle\phi|z\rangle$ then gives the joint
selection probability for a cluster with mass $M$ at redshift $z$ to be selected into a given
richness bin with $0.1 < z_{\rm photo} < 0.3$. To predict number density weighted averages for
cluster properties in the bin, we also need the differential number of clusters with mass $M$
at redshift $z$, which is simply the product of the halo mass function $dn/dM$ and the
differential comoving volume element $dV/dz$. The final kernel function $\omega_l(M, z)$ is then
obtained as

$$\omega_l(M, z) = \frac{dn}{dM}\,\frac{dV}{dz}\,\langle\psi_l|M\rangle\,\langle\phi|z\rangle, \qquad (7)$$

the integration of which gives the expectation value for the number counts $N_i$ and $N_j$,

$$\langle N_l\rangle = \int dM\,dz\;\omega_l(M, z), \qquad l \in \{i, j\}. \qquad (8)$$

The top panel in Fig. 2 shows $\langle dN/d\log M\rangle$, the distribution of halos within
richness bins, which is simply the integration of $\omega_l(M, z)$ over redshift, using our
best-fit model parameters. Here we clearly see the impact of scatter and mass function slope
discussed in §3.1: the average richness of clusters in a bin of mass is offset from the value of
the mean relation evaluated at the center of the mass bin. For example, a
$10^{15}\,h^{-1}M_\odot$ cluster would have a high probability ($\sim 40\%$) of being assigned to
the $N_{200} \in [40-70]$ richness bin (orange-colored distributions), but because
$10^{15}\,h^{-1}M_\odot$ clusters are much rarer than lower mass clusters, the number of them
represented in this bin is negligibly small. Conversely, the average mass in a bin of richness
is offset from the center of $\langle\psi|M\rangle$ for that bin: the average mass of clusters
with $N_{200} \in [40-70]$ is $\sim 5\times10^{14}\,h^{-1}M_\odot$ (solid orange circle), and as
mentioned in §3.1, the quantity $\exp\langle \ln M|N_{200}\rangle$ is even more offset from the
center of $\langle\psi|M\rangle$, landing at $M \sim 4.3\times10^{14}\,h^{-1}M_\odot$ (vertical
orange arrow).

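The expectation value for $\Delta\Sigma_j$ is

$$\langle\Delta\Sigma_j\rangle = \frac{1}{\langle N_j\rangle} \int dM\,dz\;\omega_j(M, z)\,\Delta\Sigma(R|M, z), \qquad (9)$$

where $\Delta\Sigma(R|M, z)$ is the $\Delta\Sigma(R)$ of halos with mass $M$ at redshift $z$.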
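Equations (8) and (9) amount to a kernel-weighted sum over a grid in mass and redshift. The sketch below shows that bookkeeping only: `toy_kernel` is a hypothetical stand-in for $\omega_l(M, z)$ (a pure power law in mass, chosen so the answer is known analytically), and the quantity being averaged is the mass itself rather than a real $\Delta\Sigma(R|M, z)$.

```python
import math


def kernel_average(omega, quantity, m_min, m_max, z_min, z_max, n_m=200, n_z=50):
    """Return (<N>, <quantity>) from midpoint sums over a (ln M, z) grid.

    omega(M, z)    : expected d^2N / dM dz for one richness bin
    quantity(M, z) : per-halo quantity to average, e.g. DeltaSigma(R | M, z)
    """
    dlnm = math.log(m_max / m_min) / n_m
    dz = (z_max - z_min) / n_z
    count = 0.0
    weighted = 0.0
    for a in range(n_m):
        mass = m_min * math.exp((a + 0.5) * dlnm)
        for b in range(n_z):
            z = z_min + (b + 0.5) * dz
            weight = omega(mass, z) * mass * dlnm * dz   # omega dM dz
            count += weight
            weighted += weight * quantity(mass, z)
    return count, weighted / count


def toy_kernel(mass, z):
    """Hypothetical kernel: steep power law in mass, flat in redshift."""
    return mass ** -3


# Mass in arbitrary units; the averaged quantity here is simply the mass.
count, mean_mass = kernel_average(toy_kernel, lambda m, z: m, 1.0, 10.0, 0.1, 0.3)
print(count, mean_mass)
```

Because the kernel falls steeply with mass, the mean mass lands close to the lower limit, the same effect that pulls the mean mass of each richness bin below the center of $\langle\psi|M\rangle$.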

For $\Delta\Sigma(R|M, z)$, by definition (Miralda-Escudé 1991; Sheldon et al. 2004),

$$\Delta\Sigma(R|M, z) = \Sigma(<R|M, z) - \Sigma(R|M, z), \qquad (10)$$
Figure 2. Cluster selection for the five richness bins listed in Table 2. Top:
Distribution of halo masses within each richness bin $\langle dN/d\log M\rangle$, as
computed from our best-fit model. The dotted curve gives the total number
distribution of halos within the five bins. Solid circles and vertical arrows
indicate the values of $\langle M|N_{200}\rangle$ and $\exp\langle \ln M|N_{200}\rangle$, respectively, for
each richness bin. Bottom: Richness selection function $\langle\psi|M\rangle$. Each curve
shows the probability of halos being assigned to each richness bin at mass
$M$, as computed from our best-fit $P(N_{200}|M)$. The dotted curve is the sum of
the five curves, giving the probability of halos being included in the $\Delta\Sigma(R)$
measurements. In each panel, each color represents one of the five richness
bins. The top x-axis indicates the mean richness that corresponds to the
mass at the bottom x-axis, given by our best-fit mean richness-mass relation.
Vertical dashed lines show the demarcation of richness bins based on
the top x-axis.



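where

$$\Sigma(<R|M, z) = \rho_{m,z}\,\frac{2}{R^2} \int_0^R dr_p\, r_p \int_{-\infty}^{+\infty} dr_z\, \xi_{hm}\!\left(\sqrt{r_p^2 + r_z^2},\, M\right) \qquad (11)$$

and

$$\Sigma(R|M, z) = \rho_{m,z} \int_{-\infty}^{+\infty} \xi_{hm}\!\left(\sqrt{R^2 + r_z^2},\, M\right) dr_z. \qquad (12)$$

In the above equations, $\rho_{m,z} = \Omega_m \rho_{c,0}(1 + z)^3$ is the mean density of the
universe at $z$, and $\xi_{hm}(r, M)$ is the halo-matter cross-correlation function at 3-D
distance $r$ for halos with mass $M$.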
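The projection integrals can be checked numerically against a case with a known answer. The sketch below is an illustration only: it assumes the line-of-sight integral runs over all $r_z$, and it uses a hypothetical toy correlation function $\xi(r) = (r_0/r)^2$, for which $\Delta\Sigma(R) = \pi \rho\, r_0^2 / R$ exactly. The function names and grid sizes are arbitrary choices.

```python
import math


def projected_xi(xi, radius, n_theta=400):
    """Integral of xi(sqrt(R^2 + r_z^2)) over r_z from -inf to +inf.

    Uses the substitution r_z = R tan(theta), dr_z = R / cos^2(theta) dtheta,
    with a midpoint rule on theta in (0, pi/2) and a factor 2 for symmetry.
    """
    dtheta = 0.5 * math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        cos_t = math.cos((i + 0.5) * dtheta)
        total += xi(radius / cos_t) * radius / cos_t ** 2 * dtheta
    return 2.0 * total


def delta_sigma(xi, radius, rho_mean, n_rp=400):
    """Sigma(<R) - Sigma(R) for a halo-matter correlation function xi(r)."""
    sigma_at_r = rho_mean * projected_xi(xi, radius)
    dr = radius / n_rp
    inner = 0.0
    for i in range(n_rp):
        r_p = (i + 0.5) * dr
        inner += r_p * rho_mean * projected_xi(xi, r_p) * dr
    sigma_within = 2.0 / radius ** 2 * inner
    return sigma_within - sigma_at_r


R0 = 5.0  # toy correlation length in h^-1 Mpc (illustrative)


def toy_xi(r):
    return (R0 / r) ** 2


numeric = delta_sigma(toy_xi, radius=10.0, rho_mean=1.0)
analytic = math.pi * R0 ** 2 / 10.0
print(numeric, analytic)
```

Any constant added to the surface density cancels in the difference, which is why only $\xi_{hm}$, and not $1 + \xi_{hm}$, needs to appear in the integrands.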

For $\xi_{hm}(r, M)$, we use a variant of the model proposed by HW08,

$$\xi_{hm}(r, M) = \begin{cases} \xi_{1h} & \text{if } \xi_{1h} \ge \xi_{2h}, \\ \xi_{2h} & \text{if } \xi_{1h} < \xi_{2h}, \end{cases}$$
$$\xi_{1h} = \frac{\rho_{\rm halo}(r, M)}{\rho_m} - 1, \qquad \xi_{2h} = b(M)\,\xi_{nl}(r). \qquad (13)$$

Here $\xi_{1h}$ and $\xi_{2h}$ are the so-called "1-halo" and "2-halo" terms in the halo model
(see Cooray & Sheth 2002 for a review), $\rho_{\rm halo}(r, M)$ is the NFW density profile of
halos with mass $M$, and $b(M)$ is the halo bias function. The difference from the original HW08
prescription is that we use the non-linear matter autocorrelation $\xi_{nl}$ computed from the
fitting formula of Smith et al. (2003) instead of the linear prediction $\xi_{\rm lin}$. We
demonstrate in §4.2 that this modification provides an accurate approximation to large scale
measurements of $\Delta\Sigma(R|M, z)$ from N-body simulations.
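The switch between the two terms is easy to sketch. In the block below the NFW profile is the standard form normalized to $M_{200m}$, but everything else is a hypothetical stand-in for illustration: the concentration, the bias value, and the power-law `toy_xi_nl` replace the concentration-mass relation, the bias function $b(M)$, and the Smith et al. (2003) correlation function, none of which are reproduced here.

```python
import math

RHO_CRIT = 2.775e11              # critical density, h^2 Msun / Mpc^3
RHO_MEAN = 0.3 * RHO_CRIT        # comoving mean matter density for Omega_m = 0.3


def nfw_density(r, mass, concentration, rho_mean, delta=200.0):
    """NFW density for a halo with M = (4/3) pi delta rho_mean R200^3."""
    r200 = (3.0 * mass / (4.0 * math.pi * delta * rho_mean)) ** (1.0 / 3.0)
    r_s = r200 / concentration
    norm = math.log(1.0 + concentration) - concentration / (1.0 + concentration)
    rho_s = mass / (4.0 * math.pi * r_s ** 3 * norm)
    x = r / r_s
    return rho_s / (x * (1.0 + x) ** 2)


def toy_xi_nl(r):
    """Hypothetical matter correlation function (power law), illustration only."""
    return (5.0 / r) ** 1.8


def xi_hm(r, mass, concentration, bias, xi_nl, rho_mean):
    """Larger of the 1-halo and 2-halo terms, following the form of eq. (13)."""
    xi_1h = nfw_density(r, mass, concentration, rho_mean) / rho_mean - 1.0
    xi_2h = bias * xi_nl(r)
    return max(xi_1h, xi_2h)


# Illustrative halo: M = 2e14 h^-1 Msun, concentration 5, bias 3 (assumed values).
inner = xi_hm(0.1, 2.0e14, 5.0, 3.0, toy_xi_nl, RHO_MEAN)
outer = xi_hm(10.0, 2.0e14, 5.0, 3.0, toy_xi_nl, RHO_MEAN)
print(f"xi_hm(0.1 h^-1 Mpc) = {inner:.1f} (1-halo regime)")
print(f"xi_hm(10 h^-1 Mpc)  = {outer:.3f} (2-halo regime)")
```

At the separations used in this analysis only the 2-halo branch contributes, which is why the predicted signal scales with the bias and the matter correlation function.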



3.2.2 Covariance Matrix 

The covariance matrix of the model C is comprised of three sub- 
blocks, Cnn (abundance-abundance), Cne (abundance-shear), 
and Cee (shear-shear). We begin with Cnn, which has two in- 
dependent sources of uncertainties: 

• Sample variance due to limited survey volume (a.k.a. cosmic 
variance). 

• Poisson fluctuations in cluster number counting (a.k.a. shot 
noise). 

Thus the covariance between cluster number counts in two richness 
bins i and i' is the sum of two components 



$$C_{N_iN_{i'}} = C^{\rm sample}_{N_iN_{i'}} + C^{\rm Poisson}_{N_iN_{i'}} \qquad (14)$$



and, as will become apparent below, the diagonal parts of both com- 
ponents scale with survey volume in a similar fashion. We will de- 
scribe each in turn. 

Assuming the clustering bias $b$ of clusters is linear with respect to the underlying density fluctuation on the scale of the survey $R_V$, the sample variance term is simply (Hu & Kravtsov 2003)

$$C^{\rm sample}_{N_iN_{i'}} = \langle N_i \rangle \langle N_{i'} \rangle \langle b_i \rangle \langle b_{i'} \rangle\, \sigma^2(R_V), \qquad (15)$$

where

$$\langle b_I \rangle = \frac{1}{\langle N_I \rangle}\int dM\, dz\; b(M)\,\omega_I(M, z), \quad I \in \{i, i'\}, \qquad (16)$$

and $\sigma^2(R_V)$ is the variance of the linear density fluctuation field on scale $R_V$, which we assumed to be adequately approximated by the radius of a sphere that has the same volume as the survey. On relevant scales, $\sigma^2(R)$ is approximately proportional to $1/V$, so the diagonal terms $C^{\rm sample}_{N_iN_i} \propto \langle N_i \rangle^2 / V \propto V$ (Hu & Kravtsov 2003).



The Poisson fluctuation term is trivial,

$$C^{\rm Poisson}_{N_iN_{i'}} = \delta_{ii'} \langle N_i \rangle, \qquad (17)$$



i.e., it is diagonal and the variance is equal to the expectation value of the number counts in each richness bin. Our covariance matrix differs from that of R10 in that we do not include the "stochasticity" contribution in R10. One might think that scatter between richness and mass would introduce off-diagonal covariances because clusters that scatter out of one richness bin will scatter into a neighboring bin. Indeed, when the total number of halos is held fixed, such covariances do occur. However, when one simultaneously considers both Poisson sampling and the stochasticity due to scatter in the richness-mass relation, one finds that the two are coupled in such a way that the naive Poisson terms represent the full covariance matrix. We have explicitly verified this via numerical experiments. Simply adding the Poisson and "stochastic" covariance matrices, as was done in R10, is incorrect, and the naive Poisson term alone captures the full variance due to Poisson fluctuations and the stochasticity of the richness-mass relation.
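The statement that Poisson sampling plus independent scatter yields uncorrelated, Poisson-distributed bin counts can be checked with a toy numerical experiment. The sketch below is not the authors' experiment: the mean halo number `lam`, the bin-assignment probabilities `probs`, the number of trials, and every function name are assumptions chosen for illustration. It compares a Poisson-distributed total number of halos with a fixed total.

```python
import math
import random

def poisson(lam, rng):
    """Poisson deviate by Knuth's multiplication method (fine for modest lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def bin_counts(n_halo, probs, rng):
    """Scatter each halo independently into one richness bin."""
    counts = [0] * len(probs)
    for _ in range(n_halo):
        u, acc = rng.random(), 0.0
        for i, p in enumerate(probs):
            acc += p
            if u < acc:
                counts[i] += 1
                break
    return counts

def covariance(samples, i, j):
    """Unbiased sample covariance between bins i and j."""
    n = len(samples)
    mi = sum(s[i] for s in samples) / n
    mj = sum(s[j] for s in samples) / n
    return sum((s[i] - mi) * (s[j] - mj) for s in samples) / (n - 1)

rng = random.Random(1)
lam, probs, trials = 50.0, (0.3, 0.5, 0.2), 10000
poisson_runs = [bin_counts(poisson(lam, rng), probs, rng) for _ in range(trials)]
fixed_runs = [bin_counts(int(lam), probs, rng) for _ in range(trials)]

# Poisson total: variance close to the mean lam*p_0, covariance close to zero.
print(covariance(poisson_runs, 0, 0), covariance(poisson_runs, 0, 1))
# Fixed total: multinomial counts, covariance close to -N*p_0*p_1.
print(covariance(fixed_runs, 0, 0), covariance(fixed_runs, 0, 1))
```

The fixed-total case reproduces the negative off-diagonal covariance of a multinomial distribution, while the Poisson-total case has none. This is the sense in which the naive Poisson term already accounts for the scatter.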

Figure 3. Comparison between the two components of $C_{N_iN_{i'}}$: $C^{\rm sample}_{N_iN_{i'}}$ (left) and $C^{\rm Poisson}_{N_iN_{i'}}$ (right), as computed for the best-fit model. Richness bins are shown on the y-axis of the left panel.

Table 3. Prior Specifications. The $P(k)$ shape is generated from the best-fit cosmological parameters from Reid10, and $\sigma_{\ln M|N_{200}}$ is not a model parameter but an observable. Priors that contain the form [a, b] mean the parameter in question is restricted to values within that range. Priors that contain the form $x = a \pm \delta a$ refer to a Gaussian prior of mean $\langle x \rangle = a$ and variance ${\rm Var}(x) = (\delta a)^2$. The combination of the two forms is a Truncated Gaussian. Uninformative priors mean the parameter in question is absolutely unrestricted.

Parameter                    Prior
$\Omega_m$                   Uniform on [0.05, 0.95]
$\sigma_8$                   Uniform on [0.40, 1.20]
$\sigma_{\ln N_{200}|M}$     Uniform on [0.10, 1.50]
$\beta$                      Truncated Gaussian 1.00 ± 0.06 on [0.50, 1.50]
$\ln N_1$                    Uninformative
$\ln N_2$                    Uninformative
$\Delta n_s$                 Truncated Gaussian 0.000 ± 0.013 on [-0.1, +0.1]
$P(k)$ shape                 "ΛCDM" column of table 3 in Reid et al. (2010)
$\sigma_{\ln M|N_{200}}$     Gaussian 0.45 ± 0.10; Rozo et al. (2009)

Fig. 3 compares the two components of $C_{N_iN_{i'}}$ as computed for our best-fit model. The diagonal elements of the sample variance component are much weaker than the Poisson errors except for the lowest richness bin, where sample variance dominates (Hu & Kravtsov 2003). Sample variance also produces positive off-diagonal terms. Overall, the off-diagonal terms are smaller than diagonal terms, but not completely negligible.

The other two sub-blocks, $C_{N\Sigma}$ and $C_{\Sigma\Sigma}$, also have contributions from sample variance in the cluster-mass correlation function. However, the dominant contributions to the $\Delta\Sigma(R)$ errors come from statistical errors in the weak lensing measurements themselves. One major contribution to these errors is shape noise from the random orientations of source galaxies; with rms galaxy ellipticity ~ 0.3, one needs $N \sim (0.3/\gamma)^2$ sources to measure a shear $\gamma$ at S/N = 1, and with the surface density $n_{\rm eff} \sim 1\,{\rm arcmin}^{-2}$ typical of SDSS weak lensing data, the S/N $\ll$ 1 for any individual cluster. (Tangential shear is roughly ~ 0.01 near the cluster virial radius and smaller beyond.) A second contribution comes from coherent cosmic shear lensing of source galaxies by foreground or background structure not associated with the lensing cluster (Dodelson 2004; Hoekstra et al. 2011). For the $\Delta\Sigma(R)$ covariance matrix in each richness bin, we use the empirical estimates of S09 based on jackknife re-sampling of the data set in large area patches. Where the measurement regions around clusters do not overlap, shape noise errors should be diagonal and drop with the square root of the number of source galaxies in each radial bin. However, as mentioned in S09, the jackknife errors at $R > 5\,h^{-1}$ Mpc become substantially larger than this naive expectation, and there are significant off-diagonal terms for different radial bins. These large, correlated errors presumably reflect the coherent cosmic shear effect described above, though they could also be affected by spatially coherent fluctuations in the quality of point spread function (PSF) correction. In principle there is also a $C_{N\Sigma}$ sub-block to the covariance matrix, but we ignore it here because the $\Delta\Sigma(R)$ errors are dominated by measurement error, not sample variance.

Additionally, the entire covariance matrix is affected by uncertainties in the completeness and the purity of the MaxBCG cluster sample. Interlopers and missing clusters can both bias the number counts, though the two effects tend to cancel. For $\Delta\Sigma(R)$, missing clusters only increase errors, while interlopers can bias the measurement via dilution. The magnitude of this effect is difficult to estimate without detailed simulations, since the main source of "interlopers" will be random superpositions of smaller clusters at different redshifts, and the weak lensing signal from the superposed systems may be similar to that expected from a single system of the estimated richness (in which case there is no dilution). For $N_{200} > 11$, the purity and completeness of the MaxBCG catalog are estimated to be at the ~ 95% level or higher, so any associated biases should be small, though they can be coherent across richness and radial bins. Following R10, we define the magnitude of this bias to be $A \sim 1 \pm 0.05$ and add ${\rm Var}(A) = 0.05^2$ to the fractional errors in all elements of $C_{NN}$ and diagonal elements of $C_{\Sigma\Sigma}$, respectively. We comment on our treatment of $A$ in § 6.
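One way to read the prescription for the completeness and purity bias is that a fractional variance of $0.05^2$ is added to the covariance, i.e. a term $0.05^2 \times$ (mean)$_i$ $\times$ (mean)$_j$ is added to the selected elements. The sketch below implements that reading; the function name, the `diagonal_only` switch, and the two-bin numbers are hypothetical, and the exact bookkeeping used in the paper may differ.

```python
def add_fractional_systematic(cov, mean, frac=0.05, diagonal_only=False):
    """Return cov + frac^2 * mean_i * mean_j (all elements, or the diagonal only)."""
    n = len(mean)
    out = [row[:] for row in cov]  # copy so the input matrix is left untouched
    for i in range(n):
        for j in range(n):
            if diagonal_only and i != j:
                continue
            out[i][j] += frac**2 * mean[i] * mean[j]
    return out

# Hypothetical two-bin example: number counts with a Poisson-only covariance.
mean_counts = [1000.0, 200.0]
c_nn = [[1000.0, 0.0], [0.0, 200.0]]
print(add_fractional_systematic(c_nn, mean_counts))                      # all elements
print(add_fractional_systematic(c_nn, mean_counts, diagonal_only=True))  # diagonal only
```

Adding the term to every element treats the bias as fully coherent across bins, as for the number counts; restricting it to the diagonal corresponds to the treatment described for the shear-shear block.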



3.3 Priors 

Table 3 summarizes the priors assumed in our analysis. To ensure that our results are driven by the data, we place either uniform or unrestrictive priors on five of our seven model parameters, $\Omega_m$, $\sigma_8$, $\sigma_{\ln N_{200}|M}$, $\ln N_1$, and $\ln N_2$, and conservative truncated Gaussian priors on the other two, $\beta$ and $\Delta n_s$. We take the $P(k)$ shape to be that of the best-fit ΛCDM model from Reid10, multiplied by $(k/1\,h\,{\rm Mpc}^{-1})^{\Delta n_s}$ to allow minor modulation (see footnote 3). Note that while this is a "ΛCDM" power spectrum for specific cosmological parameters, we are treating it here as the empirical description of the observed shape of the galaxy power spectrum. We also place a prior on $\sigma_{\ln M|N_{200}}$, the converse scatter defined as the dispersion in log-mass in the fixed richness bin $N_{200} = [38, 42]$. Following R10, we take the prior on $\sigma_{\ln M|N_{200}}$ directly from the analysis in Rozo et al. (2009), $\sigma_{\ln M|N_{200}} = 0.45 \pm 0.10$, which is derived by requiring consistency among the MaxBCG $L_X$-$N_{200}$ relation, the MaxBCG richness-mass relation from weak lensing, and the $L_X$-$M$ relation measured in the 400d survey (Burenin et al. 2007) by Vikhlinin et al. (2009), as well as the scatters within each of the three scaling relationships.
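The priors listed in Table 3 translate directly into a log-prior function of the kind an MCMC sampler evaluates at each step. The following is a minimal sketch under stated assumptions: the dictionary keys and function name are hypothetical, normalization constants are dropped, and the derived scatter `sigma_lnM_N` is passed in by the caller because computing it from the model parameters is outside the scope of this example.

```python
import math

# (lower, upper) bounds for the uniform priors
UNIFORM = {"omega_m": (0.05, 0.95), "sigma_8": (0.40, 1.20), "sigma_lnN_M": (0.10, 1.50)}
# (mean, standard deviation, lower, upper) for the truncated Gaussian priors
TRUNCATED_GAUSSIAN = {"beta": (1.00, 0.06, 0.50, 1.50), "delta_ns": (0.000, 0.013, -0.1, 0.1)}

def log_prior(theta, sigma_lnM_N):
    """Log prior up to an additive constant; -inf outside the allowed ranges.

    theta is a dict of model parameters. ln N_1 and ln N_2 are unrestricted and
    contribute nothing. sigma_lnM_N is the derived scatter in ln M at fixed
    richness, which receives the Gaussian prior 0.45 +/- 0.10.
    """
    for name, (lo, hi) in UNIFORM.items():
        if not lo <= theta[name] <= hi:
            return -math.inf
    lp = 0.0
    for name, (mu, sd, lo, hi) in TRUNCATED_GAUSSIAN.items():
        if not lo <= theta[name] <= hi:
            return -math.inf
        lp -= 0.5 * ((theta[name] - mu) / sd) ** 2
    lp -= 0.5 * ((sigma_lnM_N - 0.45) / 0.10) ** 2
    return lp

theta = {"omega_m": 0.30, "sigma_8": 0.80, "sigma_lnN_M": 0.40, "beta": 1.0, "delta_ns": 0.0}
print(log_prior(theta, 0.45))
print(log_prior(dict(theta, omega_m=0.99), 0.45))
```

Dropping the normalization constants is harmless for Metropolis-type samplers, which only use differences of log-probabilities between states.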



4 TESTS USING SIMULATIONS AND MOCK DATA SETS 
4.1 Basic Implementation 

For the linear matter power spectrum in our likelihood calculation, we take the cosmological parameters inferred by Reid10 and compute $P_{\rm lin}(k)$ using the low-baryon transfer function of Eisenstein & Hu (1999), which is a good approximation to the full transfer function on scales well below the BAO scale. We refer to this linear power spectrum as $P_{\rm Reid10}(k)$, as it fits their power spectrum measurements of SDSS galaxies by construction. To compute the non-linear matter correlation function $\xi_{\rm nl}$, we use the prescription from Smith et al. (2003) to generate $P_{\rm nl}(k)$ for Fourier transforming to $\xi_{\rm nl}$. The halo mass function and the halo bias function are from Tinker et al. (2008) and Tinker et al. (2010), respectively, with the halo mass defined by $M = M_{200m} = 200\rho_m V_{\rm sphere}(r_{200m})$. We use the NFW (Navarro et al. 1996) halo density profile for $\rho_{\rm halo}(r, M)$. The halo mass-concentration relationship is from the fitting formula of Zhao et al. (2009), which accurately recovers the flattening of halo concentration at high masses. (In MaxBCG we do not expect an upturn, which only shows up at redshifts beyond 1; see Prada et al. 2012.) In the parameter inference stage, the posterior distribution is derived using a Markov Chain Monte Carlo (MCMC), where an Adaptive Metropolis step method is utilized during the burn-in period to expedite the exploration of highly correlated parameter space. For each MCMC chain, we perform 320,000 iterations, 20,000 of which belong to the burn-in period for adaptively tuning the steps. To eliminate the tiny amount of residual correlation between adjacent iterations, we further thin the chain by a factor of 10 to obtain our final results.

3 The choice of pivotal wavenumber is arbitrary as it does not affect the $P(k)$ shape.

Figure 4. Comparison between the 3D halo-matter cross-correlation profiles predicted by the original HW08 prescription (dashed lines), predicted by our modified prescription that uses the non-linear correlation function (solid lines), and measured from simulation (solid circles with errorbars), for four different halo masses at $z = 0$. For each mass, the lower subpanel shows the ratios between model predictions and simulation measurement. Errorbars on the simulation measurement in the upper panels are estimated from jackknife sub-sampling of the simulation box and propagated to the ratio curves. Our analysis in this paper uses measurements of $\Delta\Sigma(R)$ beyond a projected co-moving separation of $5\,h^{-1}$ Mpc, marked by the vertical dotted lines.
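The burn-in removal and thinning described in § 4.1 reduce each chain to a fixed number of retained samples. The sketch below shows only this post-processing step on a stand-in chain of iteration indices; the function name is hypothetical and the sampler itself (including the Adaptive Metropolis tuning) is not reproduced here.

```python
def burn_and_thin(chain, n_burn, thin):
    """Drop the first n_burn samples, then keep every thin-th remaining sample."""
    return chain[n_burn::thin]

# Stand-in chain: iteration indices instead of real parameter vectors.
chain = list(range(320_000))
kept = burn_and_thin(chain, n_burn=20_000, thin=10)
print(len(kept))  # (320000 - 20000) / 10 retained samples per chain
```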



4.2 Halo-Mass Correlation Function 

To extract maximum cosmological information from the S09 $\Delta\Sigma(R)$ measurements, one should simultaneously fit the data on all scales. However, as already discussed in the introduction, we have elected to focus on large scales in this paper so that our constraints are complementary to those derived for the MaxBCG sample by R10, who measured the cluster mass function using a richness-mass relation calibrated by the S09 data on small scales. Restricting our analysis to large scales also allows us to avoid two sources of systematic error, one observational and one theoretical. The observational systematic is the effect of cluster mis-centering, which tends to depress $\Delta\Sigma(R)$ at small scales. This effect can be estimated from detailed simulations, but with some uncertainties associated with the baryonic physics (Sanderson et al. 2009) and the optical cluster finder (Johnston et al. 2007). The theoretical systematic is the uncertainty in the halo-mass correlation function in the transitional region between the NFW halo mass profile and the large-scale regime where it is a linearly biased multiple of the matter correlation function. We choose our minimum scale $R_{\rm min}$ so that this theoretical systematic is small compared to the observational uncertainties of the S09 measurements.

Figure 5. Similar to Fig. 4 but for the 2D halo surface density contrast profiles.



To determine the cutoff scale $R_{\rm min}$, and to test the accuracy of our model of $\Delta\Sigma(R)$ beyond $R_{\rm min}$, we use halos in cosmological simulations. The simulation we use is the "L1000W" presented in Tinker et al. (2008, 2010), where the halo mass function and bias function are calibrated. It evolves $1024^3$ particles with $m_p = 6.98 \times 10^{10}\,h^{-1} M_\odot$ in a periodic box of co-moving length $1000\,h^{-1}$ Mpc using the Adaptive Refinement Tree (ART; Kravtsov et al. 1997; Gottloeber & Klypin 2008) code. For the mass scales we consider here, $M > 5 \times 10^{13}\,h^{-1} M_\odot$, it has at least ~ 1000 particles for each halo and is thus well-suited for the study of $\xi_{\rm hm}$ and $\Delta\Sigma$. For the model of $\xi_{\rm hm}$, we adapt the original HW08 prescription by using $\xi_{\rm nl}(r)$ as in equation (13), so that $\xi_{\rm hm}$ is more accurate on large scales. There are other formulas for $\xi_{\rm hm}$ in the literature, but it is not clear whether any formulation applies universally across cosmologies. The large scale behavior should be dictated by linear theory in any case, and the HW08 model appears adequate for our present application.

Fig. 4 compares different $\xi_{\rm hm}$ profiles predicted by the original HW08 prescription (dashed lines), predicted by our modified prescription (solid lines), and measured from simulation (solid circles) at $z = 0$, for four halo masses that span the relevant mass range of our cluster sample. The errorbars are from jackknife re-sampling of octants of the simulation box. Both the original and the modified models recover $\xi_{\rm hm}$ on small scales ($< 1\,h^{-1}$ Mpc) very well (within 10%) because of the success of the NFW profile in describing halo density profiles within $r_{\rm vir}$. On the transitional scales between $1\,h^{-1}$ Mpc and $5\,h^{-1}$ Mpc, both models show discrepancies with the simulation of up to 20-30% due to the discontinuous change of the prescription between the 1-halo and 2-halo regimes. Beyond $5\,h^{-1}$ Mpc, the modified model clearly outperforms the original one, agreeing with the simulation within the errorbars in all mass bins, and agreeing to within 10% except for the highest mass bin, where the measurement error is large due to the small number of very massive clusters in the simulation. (In detail, the $6\,h^{-1}$ Mpc prediction is outside the error bar in the two lowest mass bins.)

Figure 6. The effect of the $P(k)$ shape prior on the $\Omega_m$-$\sigma_8$ constraint using mock data. The two left panels compare the confidence regions derived with (top) and without (bottom) using the $P(k)$ shape prior for the mock data with the original jackknife errors. The two right panels show a similar comparison for the mock data with weak-lensing errors shrunk by 1/2. In each panel, the contours in the embedded sub-panels are the confidence regions for the 10 individual realizations, and the contours in the main panel are the results after combining all 10 MCMC chains. Each set of contours shows 95% and 68% confidence regions inwards, and the star on top indicates the input cosmology ($\Omega_m = 0.30$, $\sigma_8 = 0.80$) for generating the mock data. For the "no $P(k)$ prior" cases, we compute the linear power spectrum explicitly for each iteration in the MCMC chains.

Fig. 5 shows the same comparison between the two models and the simulation measurement for $\Delta\Sigma(R)$, the quantity we care most about. Similar to the $\xi_{\rm hm}$ case, the models agree with the simulation to within a few per cent below ~ $2\,h^{-1}$ Mpc, and for most of the transitional scales between $2\,h^{-1}$ Mpc and $5\,h^{-1}$ Mpc; projection dilutes but does not eliminate the effect of the "discontinuity spikes" seen in Fig. 4. Beyond $5\,h^{-1}$ Mpc, the original HW08 model generally has some > 15% deviations from the simulation measurement in all mass bins, while the modified model is in excellent agreement with the simulation, to within 5% in the three lowest mass bins and within the measurement uncertainties (30%) in the highest mass bin. Note that the S09 $\Delta\Sigma$ measurement also has ~ 30% uncertainties on large scales. We test the effect of dropping the $\Delta\Sigma(R)$ measurements for the highest richness bin in § 6.

To bracket the redshift extent of the MaxBCG clusters, we have also done the comparison test using the simulation output at $z = 0.3$, and the results are similar. Based on the results of the tests, we conclude that the impact of scale-dependent bias on $\Delta\Sigma(R)$ is very weak on scales beyond $5\,h^{-1}$ Mpc using the modified HW08 model in Equation (13), well below the uncertainties in the S09 measurements. Therefore, we choose to use the stacked weak lensing observations beyond a co-moving scale of ~ $6\,h^{-1}$ Mpc, which for the redshifts of our cluster sample corresponds to a cutoff radius $R_{\rm min}$ of $5\,h^{-1}$ Mpc in physical units.
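The conversion between the co-moving and physical cutoff scales quoted above is a single division by $(1 + z)$. The short sketch below makes the arithmetic explicit; the redshift $z = 0.2$ is an illustrative value assumed for this example, not a number quoted in this section, and the function name is hypothetical.

```python
def comoving_to_physical(r_comoving, z):
    """Physical (proper) separation corresponding to a comoving separation at redshift z."""
    return r_comoving / (1.0 + z)

# z = 0.2 is an assumed, illustrative cluster redshift.
print(comoving_to_physical(6.0, 0.2))  # in h^-1 Mpc
```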

4.3 Power Spectrum Shape as a Prior 

Once we have determined $R_{\rm min}$, the large scale 3D density contrast profile of halos, $\Omega_m \xi_{\rm hm}(r, M)$, carries clean and easily accessible cosmological information, as $\Omega_m \xi_{\rm hm}(r, M) \propto \Omega_m b(M) \sigma_8^2$ if we know the shape of the matter correlation function $\xi_{\rm mm}(r)$ well. However, our measurements of $\Delta\Sigma(R)$ provide only limited constraints on the shape of $\xi_{\rm mm}(r)$, in part because of projection, in part because we use only large scale measurements, and most of all because the statistical errors in the $\Delta\Sigma(R)$ measurements remain fairly large. Uncertainty in the $\xi_{\rm mm}(r)$ shape would limit our ability to optimally combine measurements from multiple scales, and it would limit our ability to translate our measurements from these scales to a value of $\sigma_8$, which is defined at the specific scale of $8\,h^{-1}$ Mpc.

To circumvent this problem, we introduce the shape of the power spectrum measured from SDSS Luminous Red Galaxies by Reid10 as a prior in our Bayesian analysis, without introducing any explicit priors on individual cosmological parameters. We are taking advantage of the fact that the shape of the power spectrum is well constrained observationally by galaxy clustering data, even though it is not well constrained by our $\Delta\Sigma(R)$ measurements. We allow deviations from $P_{\rm Reid10}(k)$ as parameterized by $\Delta n_s$. To test the performance of the $P(k)$ shape as a prior and its sensitivity to uncertainties in $\Delta\Sigma(R)$, we generate two sets of mock data, one using the original S09 jackknife errors, and one using half the S09 errors. For each mock set, we produce 10 random realizations from the multivariate Gaussian describing the cluster counts and $\Delta\Sigma(R)$ values, using the parameters $\Omega_m = 0.30$, $\sigma_8 = 0.80$, $\sigma_{\ln N_{200}|M} = 0.36$, $\beta = 1.0$, $\ln N_1 = 2.4$, $\ln N_2 = 4.2$, and $\Delta n_s = 0.0$, along with other cosmological parameters set as the WMAP7 values. We also compute the $P(k)$ shape from the same cosmology, and we perform MCMC analyses on the 10 random realizations of each mock data set with and without using the $P(k)$ shape prior. When there is no shape prior used, we fix all the cosmological parameters to be their WMAP7 values except for $\Omega_m$ and $\sigma_8$, and the $P(k)$ shape varies with $\Omega_m$ according to the ΛCDM prediction.
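Drawing mock realizations from a multivariate Gaussian, as described above, can be sketched with a Cholesky factor of the covariance matrix. The two-element data vector, the covariance values, the seed, and the function names below are all hypothetical; the actual mock data use the model prediction for the cluster counts and $\Delta\Sigma(R)$ together with the full covariance matrix. Halving the errors corresponds to dividing the covariance by four.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular L with L L^T = cov (cov must be symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def draw_realization(mean, cov, rng):
    """One draw from N(mean, cov): mean + L z with z a vector of unit Gaussians."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]

rng = random.Random(2012)
model = [10.0, 6.0]                    # hypothetical noiseless data vector
cov_full = [[4.0, 2.0], [2.0, 3.0]]    # hypothetical covariance matrix
cov_half = [[c / 4.0 for c in row] for row in cov_full]  # errors halved
mocks_full = [draw_realization(model, cov_full, rng) for _ in range(10)]
mocks_half = [draw_realization(model, cov_half, rng) for _ in range(10)]
print(mocks_full[0], mocks_half[0])
```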

Fig. 6 shows the effect of introducing the $P(k)$ shape prior on the $\Omega_m$-$\sigma_8$ constraints for the two mock data sets. For the mock data with original weak-lensing errors (left two panels), when the $P(k)$ shape is a priori unknown (bottom left panel), the analysis generally accepts an incorrect region of high-$\Omega_m$ and low-$\sigma_8$ within 95% confidence, allowing models in which the associated $P(k)$ becomes much bluer due to an earlier epoch of matter-radiation equality and transforms into a $\xi_{\rm mm}$ that is too strong (weak) on small (large) scales. Despite being physically unlikely, after projection this model leads to $\Delta\Sigma(R)$ profiles that are consistent with the mock data within the errors. When the shape prior is used (top left panel), the high-$\Omega_m$ and low-$\sigma_8$ region is correctly rejected by the model, demonstrating the efficacy of the shape prior in eliminating the degeneracy between uncertainties in the $P(k)$ shape and cosmological parameters.

Table 4. Best-fit Models

Parameter                    MaxBCG                          MaxBCG+WMAP7
$\Omega_m$                   $0.325^{+0.086}_{-0.067}$       $0.298^{+0.019}_{-0.020}$
$\sigma_8$                   $0.828^{+\ldots}_{-0.097}$      $0.831^{+0.020}_{-0.020}$
$\sigma_{\ln N_{200}|M}$     $\ldots$                        $0.43\ldots^{+\ldots}_{-0.024}$
$\beta$                      $1.004^{+0.060}_{-0.060}$       $0.9\ldots8^{+\ldots}_{-0.030}$
$\ln N_1$                    $2.446^{+0.142}_{-0.127}$       $2.465^{+0.094}_{-\ldots}$
$\ln N_2$                    $4.148^{+0.249}_{-0.229}$       $4.1\ldots^{+0.129}_{-\ldots}$
$\Delta n_s$                 $0.001^{+0.013}_{-0.013}$       $0.001^{+0.001}_{-0.001}$

For the mock data with smaller weak-lensing errors (right two panels), both $\Omega_m$ and $\sigma_8$ are well constrained either with or without using the shape prior, showing that the shape degeneracy greatly diminishes with the reduced weak-lensing errors even though we do not use small scale $\Delta\Sigma(R)$ information. The contours are slightly more elongated when the shape prior is used (top right panel), because it is easier for extreme values of $\Omega_m$ and $\sigma_8$ to fit the observed cluster richness function when the right $P(k)$ shape is directly known compared to when the right $n_s$, $\Omega_b$, $\Omega_\nu$, and $h$ are known. In other words, incorrect values of $\Omega_m$ now produce deviations in the shape of $\Delta\Sigma(R)$ that are detectable with the smaller errors. Fig. 6 also demonstrates that the uncertainties of our constraints depend crucially on the statistical errors in the weak lensing measurement on large scales: the $1\sigma$ uncertainties in $\Omega_m$ and $\sigma_8$ are reduced by ~ 45% after the weak-lensing errors shrink by a factor of two in the lower right panel.



5 PARAMETER CONSTRAINTS 

Our best-fit model is summarized in the first column of Table 4, where for each parameter we quote the median (50%) as central value and the [15.84%, 84.16%] interval as $\pm 1\sigma$ uncertainties.
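Quoting the median with the 15.84th and 84.16th percentiles as the $1\sigma$ bounds can be done directly on the thinned chain of each parameter. The sketch below uses a simple linear-interpolation percentile; the function names are hypothetical, and the interpolation convention is an assumption, since the text does not state which one was used.

```python
def percentile(samples, q):
    """Linear-interpolation percentile (q in [0, 100]) of a 1D list of samples."""
    xs = sorted(samples)
    pos = q / 100.0 * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def summarize(samples):
    """Median plus the lower and upper 1-sigma offsets from the median."""
    med = percentile(samples, 50.0)
    return med, med - percentile(samples, 15.84), percentile(samples, 84.16) - med

# Stand-in posterior samples: evenly spaced values 0..100.
print(summarize(list(range(101))))
```

For an asymmetric posterior the two offsets differ, which is why the constraints in this paper are quoted with separate upper and lower errors.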

Before presenting the detailed results of our analysis, it is instructive to illustrate how the addition of large scale weak lensing measurements helps break the degeneracy in cluster abundance measurements among the cosmological parameters $\Omega_m$ and $\sigma_8$ and the nuisance parameter $\sigma_{\ln N_{200}|M}$. The experiments shown in Fig. 7 are designed to serve this purpose, providing a more detailed form of the approximate argument sketched in the introduction. Starting from the fiducial model (Table 4, column 2) that matches both the cluster number counts and $\Delta\Sigma(R)$ data, in each row of Fig. 7 we raise one of the three key parameters (from top to bottom: $\Omega_m$, $\sigma_8$, and $\sigma_{\ln N_{200}|M}$) by a factor of 1.5 from its fiducial value while keeping the other two parameters fixed at their fiducial values. We then re-fit to the cluster abundance data alone by varying the mean richness-mass relation. The new best-fit model in each row thus represents one of three families of false models that are indistinguishable from the underlying true model if we only employ the cluster abundance data for constraint. With the modified richness-mass relation in the right column, the new best-fit model predicts cluster number counts that can match the original data (left column) but different $\Delta\Sigma(R)$ profiles that cannot (middle column). In detail,

• When $\Omega_m$ is increased (top row), there is no change in the amplitude of clustering except for an overall change of density, so the halo mass and bias functions both shift uniformly to higher masses. Matching the observed number counts only requires a decrease in the overall amplitude of the mean richness-mass relation. Since there is no change in the cluster bias at fixed richness, the boost in the $\Delta\Sigma(R)$ profiles is directly caused by the increase in $\Omega_m$ and is thus independent of distance and richness.

• When $\sigma_8$ is increased (middle row), the halo mass function changes amplitude and shape: the abundance of halos throughout the cluster mass regime increases, but the increase is larger at higher masses. To fit the observed abundance as a function of richness, the mean richness-mass relation must tilt downward, becoming shallower than the fiducial relation. This change suffices to reproduce the original number counts, but the $\Delta\Sigma(R)$ profiles shift upward because of the growth in the amplitude of $\xi_{\rm mm}(r)$. This growth ($\propto \sigma_8^2$) is partly compensated by a reduction in halo bias factors, but this reduction is mass-dependent, with the consequence that $\Delta\Sigma(R)$ grows more for high richness clusters than for low richness clusters.

• When $\sigma_{\ln N_{200}|M}$ is increased (bottom row), it has a very similar effect on cluster number counts as increasing $\sigma_8$, scattering progressively more low mass halos up into each richness bin than the fiducial model. To counter this effect, the mean mass-richness relation drops in amplitude and becomes shallower, achieving a good match to the original number counts. However, the changes in $\Delta\Sigma(R)$ are opposite to those that arise from increasing $\sigma_8$: because more low mass halos scatter into a given richness bin when $\sigma_{\ln N_{200}|M}$ is higher, the amplitude of $\Delta\Sigma(R)$ decreases despite the shift in the mean richness-mass relation, dropping on both large and small scales. The impact of increased scatter is higher at larger richness because of the steeper mass function in this regime.

For a more detailed discussion of the dependence of halo populations on $\Omega_m$ and $\sigma_8$, we refer the reader to Zheng et al. (2002).

Figure 7. Illustration of the underlying methodology of our analysis. In each row, one of the three key parameters (from top to bottom: $\Omega_m$, $\sigma_8$, and $\sigma_{\ln N_{200}|M}$) is increased from its fiducial value by a factor of 1.5, and we vary the mean richness-mass relation (blue vs. gray bands in the right column) to find a new best-fit to match the cluster number counts (blue vs. gray histograms in the left column). The panels in the middle column then compare the $\Delta\Sigma(R)$ profiles predicted by the new best-fit model (blue dashed curves) to the fiducial profiles (gray solid curves). Points with error bars indicate the S09 measurements of $\Delta\Sigma(R)$ at $5\,h^{-1}$ Mpc and $15\,h^{-1}$ Mpc for the three richness bins. See the text for more details.

Figure 8. Confidence regions from our analysis of the MaxBCG data in the 2D planes that comprise all the pair sets of model parameters. Histograms in the diagonal panels show 1D posterior distributions of individual parameters. Contour levels run through confidence limits of 95% (light brown) and 68% (dark brown) inwards. The assumed prior distributions for $\beta$ and $\Delta n_s$ are shown as dashed curves on the fourth and the seventh diagonal panel, respectively; they are barely distinguishable from the posterior distributions.

We plot representative measurements and error bars from the S09 $\Delta\Sigma(R)$ data in the middle panels, specifically the $5\,h^{-1}$ Mpc and $15\,h^{-1}$ Mpc points in each of the three richness bins. The impact of the (large) $\Omega_m$ and $\sigma_8$ changes illustrated here is significant compared to these statistical errors, and of course our full $\Delta\Sigma(R)$ data set has six data points for each of five richness bins (see Fig. 1), with errors that are only mildly correlated. The data should thus have substantial constraining power. We can see that there is degeneracy between $\Omega_m$ and $\sigma_8$ as expected, but this degeneracy is partly broken by the different richness and scale dependence of the two parameter effects. The impact of increasing the scatter $\sigma_{\ln N_{200}|M}$ by 50% is much smaller than the impact of similar changes to the cosmological parameters, and it is strongly richness dependent, essentially vanishing for our low richness bins. We can therefore anticipate that a rather loose prior on this nuisance parameter will be enough to avoid degradation of the cosmological constraints.

The $\Delta\Sigma(R)$ changes at $R < 2\,h^{-1}$ Mpc effectively illustrate the origin of the R10 cosmological constraints, which use the MaxBCG number counts and the small scale (1-halo regime) weak lensing measurements. The different impact of a $\sigma_8$ change at small and large scales shows why the index $\gamma$ of the best constrained $\sigma_8\Omega_m^\gamma$ combination will be higher for our analysis than for R10's.

Fig. 8 presents an overview of this paper's principal results, showing the 1D posterior distribution for each of the 7 model parameters (diagonal panels), and the 95% and 68% confidence regions for all the parameter pairs (off-diagonal panels). The prior distributions of $\beta$ and $\Delta n_s$ are also shown on corresponding diagonal panels. We will refer back to Fig. 7 and zoom in on different subsets of Fig. 8 multiple times in the following discussion. We are most interested, of course, in the constraints in the $\Omega_m$-$\sigma_8$ plane, but we must understand their dependence on other parameters.

5.1 Comparison to WMAP 

The brown contours in the left panel of Fig. 9 show a zoom-in version of the $\Omega_m$-$\sigma_8$ panel from Fig. 8, marking the 68% and 95% confidence limits from our MaxBCG analysis. The error ellipses are elongated approximately along a degeneracy track of $\sigma_8(\Omega_m/0.325)^{0.501} = 0.828 \pm 0.049$, with the marginalized constraints $\Omega_m = 0.325^{+0.086}_{-0.067}$ and $\sigma_8 = 0.828^{+\ldots}_{-0.097}$ ($1\sigma$ errors). The $\sigma_8\Omega_m^\gamma$ alignment of the error ellipses is typically seen in cluster abundance-based cosmological constraints, reflecting the counterbalancing impact of the two parameters on the halo mass function. The exact value of $\gamma$ is contingent on cluster mass range and ancillary information used in each analysis, but it usually lies between 0.4 and 0.6.
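The degeneracy track quoted in this subsection can be evaluated numerically to see how a point in the $\Omega_m$-$\sigma_8$ plane compares with the best-constrained combination. The sketch below uses the pivot 0.325, index 0.501, and amplitude 0.828 from the text; the function names are hypothetical, and the comparison point is the joint MaxBCG+WMAP7 best fit quoted in this paper.

```python
def degeneracy_combination(sigma_8, omega_m, pivot=0.325, gamma=0.501):
    """Best-constrained combination sigma_8 * (omega_m / pivot)^gamma."""
    return sigma_8 * (omega_m / pivot) ** gamma

def sigma8_on_track(omega_m, amplitude=0.828, pivot=0.325, gamma=0.501):
    """sigma_8 implied by the central degeneracy track at a given omega_m."""
    return amplitude * (omega_m / pivot) ** (-gamma)

print(sigma8_on_track(0.325))                # equals the amplitude at the pivot
print(degeneracy_combination(0.831, 0.298))  # joint best-fit point, to compare with 0.828 +/- 0.049
```

A point is consistent with the MaxBCG-only constraint at the $1\sigma$ level when its combination lies within 0.049 of 0.828.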

Our results are consistent with but orthogonal to the WMAP seven-year results, which are shown as the red contours in the left panel of Fig. 9 (Komatsu et al. 2011). Our measurements pull in the direction of higher $\Omega_m$ and $\sigma_8$ relative to WMAP alone. The WMAP constraints rely strongly on the assumptions of the flat ΛCDM cosmological model to extrapolate growth from $z = 1100$ to low redshifts. Our results are only weakly dependent on ΛCDM assumptions, as we are using empirical constraints on the shape of $P(k)$ and measuring a cross-correlation that scales directly with the matter density and the low redshift amplitude of matter clustering. As with cluster mass function studies, therefore, our results can be viewed as a consistency test of the GR + cosmological constant model, one that focuses on growth of structure rather than expansion history. The joint constraints from combining the two experiments shrink the regions of equivalent confidence limits to the blue contours, yielding $\Omega_m = 0.298^{+0.019}_{-0.020}$ and $\sigma_8 = 0.831^{+0.020}_{-0.020}$. The best-fit joint model is summarized in the "MaxBCG+WMAP7" column of Table 4.



5.2 Comparison to Other Cluster Cosmology Probes 

The right panel of Fig. 9 compares the constraints from our analysis to those of two other studies using the same cluster sample, R10 and Tinker et al. (2012, hereafter Tinker12). Although using the same underlying clusters and weak lensing measurements, the three analyses are quite different from one another, as highlighted in Table 5. (Regarding the fourth row, note that galaxy clustering plays a tangential role in our analysis via the power spectrum shape prior but is central to the Tinker12 analysis.)

Table 5. Input data used in the three MaxBCG-based cosmological constraints.

                                R10     Tinker12             This paper
Abundance                       Yes     No                   Yes
Small Scale $\Delta\Sigma(R)$   Yes     Yes                  No
Large Scale $\Delta\Sigma(R)$   No      No                   Yes
Other                           None    Galaxy Clustering    None

Using the weak lensing masses, R10 obtained the constraint
$\sigma_8(\Omega_m/0.25)^{0.41} = 0.832 \pm 0.033$ (red contours in the right panel
of Fig. 9). The degeneracy index $\gamma = 0.41$ is slightly shallower
than ours because the small scale $\Delta\Sigma(R)$ responds
more strongly to $\sigma_8$ than the large scale $\Delta\Sigma(R)$, as seen in the
central panel of Fig. 7. The R10 errors are smaller than ours, primarily
because of the higher S/N of the small scale $\Delta\Sigma(R)$ measurements
used in their analysis. It is worth emphasizing, however, that our
errors are dominated by statistical errors in the $\Delta\Sigma(R)$ data, while
the R10 errors have substantial contributions from the weak lensing
bias uncertainty $\beta$ and from uncertainties on the mis-centering
correlation.

The blue contours in the right panel of Fig. 9 show the constraint
from Tinker12, expressed as $\sigma_8(\Omega_m/0.290)^{0.5} = 0.863 \pm
0.048$. Tinker12 derived constraints from measuring the mass-to-number
ratio within MaxBCG clusters, therefore using the same
scales of $\Delta\Sigma(R)$ as R10. They employed additional information
from Zehavi et al.'s (2011) galaxy clustering measurements, which
they used to constrain the halo occupation distribution (HOD)
within clusters as a function of cosmology. The dominant uncertainties
in the Tinker12 error bars are systematic, from uncertainties
in evolution of the galaxy luminosity function and the galaxy
HOD, and from theoretical uncertainties in the halo mass function
and halo bias relation.

In the right panel of Fig. 9, the three sets of contours largely
overlap with each other, around the region that corresponds to the
68% confidence limit from our WMAP7 joint constraints shown
in the left panel. This good agreement is encouraging, since our
analysis uses different scales of $\Delta\Sigma(R)$ and is insensitive to
systematic uncertainties that affect the other two analyses. There are
enough commonalities that it would be risky to combine the three
constraints as though they were independent, but the agreement
certainly suggests that the errors are no larger than those quoted in
the individual studies. As the R10 and Tinker12 results have been
shown to agree with the constraints from X-ray cluster studies, the
consistency between our results and theirs expands the evidence for
broad consistency between optical and X-ray studies in cluster
cosmology. As statistical precision improves with larger samples (of
clusters, of WL source galaxies, and of high quality X-ray
measurements), these consistency tests will become considerably more
stringent.
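
The three constraints quoted above use different pivot values of $\Omega_m$, which makes a by-eye comparison awkward. As a rough illustration only, the sketch below solves each quoted relation $\sigma_8(\Omega_m/\Omega_{\rm pivot})^{\gamma} = c$ for $\sigma_8$ at a common $\Omega_m$. The function and dictionary names are hypothetical, the central values are those quoted in this section, and the calculation ignores the error bars and the finite width of each degeneracy band.

```python
# Central values (amplitude c, pivot Omega_m, degeneracy index gamma)
# as quoted in Section 5.2; error bars are deliberately ignored here.
CONSTRAINTS = {
    "this_paper": (0.828, 0.325, 0.501),
    "R10": (0.832, 0.25, 0.41),
    "Tinker12": (0.863, 0.290, 0.5),
}


def sigma8_on_track(omega_m, amplitude, pivot, gamma):
    """Solve sigma8 * (omega_m / pivot)**gamma = amplitude for sigma8."""
    return amplitude * (omega_m / pivot) ** (-gamma)


def sigma8_at(omega_m):
    """Central sigma8 of each degeneracy track at a common omega_m."""
    return {
        name: sigma8_on_track(omega_m, *pars)
        for name, pars in CONSTRAINTS.items()
    }


if __name__ == "__main__":
    # Omega_m = 0.298 is the joint MaxBCG+WMAP7 value quoted in the text.
    for name, s8 in sigma8_at(0.298).items():
        print(f"{name}: sigma8 = {s8:.3f} at Omega_m = 0.298")
```

Evaluating the tracks at one $\Omega_m$ only compares their central lines; a proper comparison would use the full posterior contours shown in Fig. 9.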

5.3 Constraint on the Richness-Mass Relation 

For specified values of $\Omega_m$ and $\sigma_8$, the statistical mass calibration
in our analysis places strong constraints on the richness-mass
relation. The inferred relation shifts systematically with $\Omega_m$ and $\sigma_8$, as
the constraint comes mainly from matching the observed richness
Figure 9. Comparison of our cosmological constraints on the $\Omega_m$-$\sigma_8$ plane with WMAP7 and two other studies using MaxBCG. Left: Constraints from our
MaxBCG analysis (brown), WMAP7 (red), and the combination of both (blue). Right: Constraints from our analysis (brown), R10 (red), and Tinker12 (blue).



distribution given the halo mass function. The model parameters
are $\ln N_1$ and $\ln N_2$, but the constraint can be expressed in a more
intuitive form by changing parameters to

$$\langle \ln N_{200} | M \rangle = A + \alpha \ln(M/M_{\rm pivot}), \qquad (18)$$



where the pivot mass is chosen to minimize correlation between
the amplitude $A$ and slope $\alpha$. Even for fixed cosmology, the values
of $A$ and $\alpha$ are inevitably correlated with the scatter $\sigma_{\ln N_{200}|M}$,
because it is the combination of the central relation and the scatter
that determines the distribution of richness as a function of mass.
As shown in the bottom panel of Fig. 7, one can reproduce nearly
the same cluster number counts by increasing $\sigma_{\ln N_{200}|M}$, lowering
$A$, and adopting a slightly shallower slope $\alpha$. However, the richness
dependence of $\Delta\Sigma(R)$ provides some leverage to break this
degeneracy.

For the cosmological parameters and scatter of our best-fit
MaxBCG+WMAP7 model, we obtain $A = 3.233 \pm 0.020$ and
$\alpha = 0.739 \pm 0.009$ with a pivot mass $M_{\rm pivot} = 3.614 \times
10^{14}\,h^{-1}M_\odot$, which minimizes the correlation between the two
parameters. These (68%) errors are marginalized over the nuisance
parameters $\beta$ and $\Delta n_s$. If we also marginalize over $\sigma_{\ln N_{200}|M}$ but
use the same pivot mass, the errors increase to $A = 3.255 \pm 0.069$
and $\alpha = 0.744 \pm 0.016$. We evaluate the dependence of $A$ and $\alpha$
on $\Omega_m$, $\sigma_8$, and $\sigma_{\ln N_{200}|M}$ by shifting each parameter in turn by
$\pm 10\%$ from its best-fit MaxBCG+WMAP7 value and redetermining
the best-fit $A$ and $\alpha$ (i.e., similar to the experiments described
in Fig. 7 but with smaller fractional shifts). Our final result is



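$$A = \left(\frac{\Omega_m}{0.298}\right)^{[\ldots]} \times (3.233 \pm 0.020) \left(\frac{\sigma_8}{0.831}\right)^{-[\ldots]} \left(\frac{\sigma_{\ln N_{200}|M}}{0.426}\right)^{-0.055},$$

$$\alpha = \left(\frac{\Omega_m}{0.298}\right)^{0.254} \times (0.739 \pm 0.009) \left(\frac{\sigma_8}{0.831}\right)^{-0.446} \left(\frac{\sigma_{\ln N_{200}|M}}{0.426}\right)^{[\ldots]}, \qquad (19)$$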

with pivot mass $M_{\rm pivot} = 3.614 \times 10^{14}\,h^{-1}M_\odot$. Away from the
fiducial values of Equation 19, the errors on $A$ and $\alpha$ may change,
and they may become moderately correlated as the effective pivot
mass drifts.
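
As a minimal sketch of how the mean relation of Equation 18 is used, the snippet below evaluates it with the best-fit MaxBCG+WMAP7 amplitude, slope, and pivot mass quoted above, and inverts it to obtain the "central" mass for a given richness. The function names are hypothetical, and the scatter about the mean relation is ignored.

```python
import math

A_BEST = 3.233       # <ln N200> at the pivot mass (best-fit value from the text)
ALPHA_BEST = 0.739   # slope of the mean relation (best-fit value from the text)
M_PIVOT = 3.614e14   # pivot mass in h^-1 Msun


def mean_ln_richness(mass, amplitude=A_BEST, slope=ALPHA_BEST, pivot=M_PIVOT):
    """Mean ln N200 at halo mass `mass` (h^-1 Msun), following Equation 18."""
    return amplitude + slope * math.log(mass / pivot)


def central_mass(n200, amplitude=A_BEST, slope=ALPHA_BEST, pivot=M_PIVOT):
    """Mass at which the mean relation equals ln(n200); scatter is ignored."""
    return pivot * math.exp((math.log(n200) - amplitude) / slope)


if __name__ == "__main__":
    n_pivot = math.exp(mean_ln_richness(M_PIVOT))
    print(f"exp(<ln N200>) at the pivot mass: {n_pivot:.1f}")
    print(f"central mass for N200 = 50: {central_mass(50.0):.3e} h^-1 Msun")
```

At the pivot mass the relation returns $e^{A} \approx 25.4$, which matches the normalization quoted in the abstract.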

To compare to the best-fit $\langle \ln N_{200}|M \rangle$ in R10, we scale our
best-fit $A$ and $\alpha$ to their fiducial cosmology and scatter using the
equations above, and normalize the amplitude to their $M_{\rm pivot} =
7.63 \times 10^{13}\,h^{-1}M_\odot$. The best-fit values are then $A' = 2.140$ and
$\alpha' = 0.890$, consistent with the R10 constraints ($A_{\rm R10} = 2.34 \pm
0.10$ and $\alpha_{\rm R10} = 0.757 \pm 0.066$). Our richness-mass relation is
rotated upward from the R10 relation by $\sim 4.5^\circ$ around $3.432 \times
10^{14}\,h^{-1}M_\odot$, where the two cross.
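
The crossing mass and rotation angle quoted above follow from the two straight lines in the $\ln M$-$\ln N_{200}$ plane, and the arithmetic can be checked with a short sketch. The function names are hypothetical. The R10 pivot is taken here as $7.63 \times 10^{13}\,h^{-1}M_\odot$, which is an assumption on the exponent: it is the value for which the quoted crossing mass is reproduced.

```python
import math


def crossing_mass(a1, slope1, a2, slope2, pivot):
    """Mass where a1 + slope1*x equals a2 + slope2*x, with x = ln(M/pivot)."""
    x = (a2 - a1) / (slope1 - slope2)
    return pivot * math.exp(x)


def rotation_angle_deg(slope1, slope2):
    """Angle between two lines of the given slopes, in degrees."""
    return math.degrees(math.atan(slope1) - math.atan(slope2))


if __name__ == "__main__":
    pivot_r10 = 7.63e13  # ASSUMED exponent; units of h^-1 Msun
    m_cross = crossing_mass(2.140, 0.890, 2.34, 0.757, pivot_r10)
    angle = rotation_angle_deg(0.890, 0.757)
    print(f"crossing mass: {m_cross:.3e} h^-1 Msun")
    print(f"rotation angle: {angle:.2f} deg")
```

The angle is computed for equal axis scaling in $\ln M$ and $\ln N_{200}$; a plot with a different aspect ratio would show a different apparent rotation.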

We list the average mass and bias of MaxBCG clusters in bins
of richness as computed from the best-fit MaxBCG+WMAP7 joint
model in Tables 1 and 2. More specifically, we take the best-fit
model parameters and calculate the kernel function weighted averages
via

$$\langle M_{200m} \rangle = \frac{1}{W_j} \iint M\, \omega_j(M,z)\, dM\, dz, \qquad (20)$$

$$\langle b \rangle = \frac{1}{W_j} \iint b(M,z)\, \omega_j(M,z)\, dM\, dz. \qquad (21)$$



Note that the values of $\langle M_{200m} \rangle$ depend on both the scatter and
the slopes of the halo mass function through $\omega_j(M, z)$, so they are
different from the "central" values obtained by directly inverting the mean
richness-mass relation in Equation 18. This is most easily seen from the
top panel of Fig. 2, where the actual halos inside each richness bin
are mostly those with masses well below the "central" mass implied
by the mean richness-mass relation for that bin. The mean halo
masses are less strongly offset, as shown by the points in that panel,
because the more massive clusters in the bin carry proportionally
more weight. Our average cluster masses are in good agreement
with the predictions from R10 (their table 2, column 3), which in
turn are fits to the weak lensing masses measured in Johnston et al.
(2007).
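
The offset between the kernel-weighted mean mass of Equation 20 and the "central" mass can be illustrated with a toy calculation. The sketch below is not the kernel $\omega_j(M,z)$ used in the paper: it assumes a pure power-law mass function (the slope `MF_SLOPE` is a hypothetical stand-in for a real halo mass function), lognormal scatter in richness at fixed mass, and no redshift dependence. The richness-mass parameters are the best-fit values quoted in §5.3.

```python
import math

import numpy as np

A, ALPHA, M_PIVOT = 3.233, 0.739, 3.614e14  # best-fit mean relation (Section 5.3)
SIGMA_LNN = 0.426                            # scatter in ln N200 at fixed mass
MF_SLOPE = -2.0  # ASSUMPTION: toy dn/dlnM ~ M**MF_SLOPE, not a real mass function


def _norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bin_weight(mass, n_lo, n_hi):
    """Toy kernel: abundance times P(n_lo <= N200 < n_hi | M)."""
    mu = A + ALPHA * math.log(mass / M_PIVOT)
    p_in_bin = (_norm_cdf((math.log(n_hi) - mu) / SIGMA_LNN)
                - _norm_cdf((math.log(n_lo) - mu) / SIGMA_LNN))
    return (mass / M_PIVOT) ** MF_SLOPE * p_in_bin


def mean_mass_in_bin(n_lo, n_hi, n_grid=4000):
    """Kernel-weighted mean mass in a richness bin (single redshift)."""
    masses = np.exp(np.linspace(math.log(1e12), math.log(1e16), n_grid))
    weights = np.array([bin_weight(m, n_lo, n_hi) for m in masses])
    # The grid is uniform in ln M, so the d(ln M) factor cancels in the ratio.
    return float(np.sum(masses * weights) / np.sum(weights))


def central_mass(n200):
    """Mass obtained by inverting the mean relation, ignoring scatter."""
    return M_PIVOT * math.exp((math.log(n200) - A) / ALPHA)


if __name__ == "__main__":
    n_lo, n_hi = 20.0, 25.0  # hypothetical richness bin
    print(f"mean mass:    {mean_mass_in_bin(n_lo, n_hi):.3e} h^-1 Msun")
    print(f"central mass: {central_mass(math.sqrt(n_lo * n_hi)):.3e} h^-1 Msun")
```

Because the toy mass function falls steeply with mass, low-mass halos scattering up in richness outnumber high-mass halos scattering down, so the weighted mean mass lies below the central mass, in the same sense as described in the text.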
Figure 10. Comparison of the correlations between the cosmological
parameters ($\Omega_m$ and $\sigma_8$) and the nuisance parameters ($\sigma_{\ln N_{200}|M}$ and $\beta$),
before and after applying specific priors. In each panel, light and dark brown
contours indicate the 68% and 95% confidence regions from our fiducial
analysis, which has priors on the nuisance parameters, and blue contours
show the results when there is no prior either on $\sigma_{\ln N_{200}|M}$ (bottom
panels) or on $\beta$ (top panels). Dotted curves on top of each contour set represent
the degeneracy track followed by the correlation, calculated from the
eigen-decomposition of the underlying correlation matrices.



5.4 Constraint on the Nuisance Parameters 

As identified in Fig. 7, the main nuisance parameters in our analysis
are the scatter in the richness-mass relation $\sigma_{\ln N_{200}|M}$ and the
weak lensing bias $\beta$, both of which are expected to correlate with
the cosmological parameters. To mitigate the impact of these
correlations, we have placed two Gaussian priors to help determine the
ranges of $\sigma_{\ln N_{200}|M}$ and $\beta$, one on the converse scatter and one
directly on $\beta$ (Table ...).

Fig. 10 compares the correlations between the two nuisance
parameters ($\sigma_{\ln N_{200}|M}$ and $\beta$) and the two cosmological
parameters ($\Omega_m$ and $\sigma_8$), before and after using their respective priors.
As expected, when the prior on the converse scatter $\sigma_{\ln M|N_{200}}$
is absent, we observe strong correlation between $\sigma_{\ln N_{200}|M}$ and
$\sigma_8$, scaling as $\sigma_8(\sigma_{\ln N_{200}|M}/0.534)^{-0.483} = 0.953$ (blue dotted
curve in the bottom right panel). The prior on $\sigma_{\ln M|N_{200}}$
effectively eliminates the high-$\sigma_{\ln N_{200}|M}$ and high-$\sigma_8$ region and
shifts the residual correlation to $\sigma_8(\sigma_{\ln N_{200}|M}/0.432)^{-0.505} =
0.828$ (black dotted curve in the bottom right panel). Similarly, if
we allow $\beta$ to vary freely, there is correlation between $\Omega_m$ and
$\beta$, scaling as $\beta(\Omega_m/0.349)^{-0.711} = 1.14$ (blue dotted curve in
the top left panel), which then diminishes under the prior on $\beta$
to $\beta(\Omega_m/0.325)^{[\ldots]} = 1.0$ (black dotted curve in the top left
panel). There are almost no correlations between $\sigma_{\ln N_{200}|M}$ and
$\Omega_m$ (bottom left panel) and between $\beta$ and $\sigma_8$ (top right panel),
either before or after imposing the priors.
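
Power-law degeneracy tracks of the form $y\,x^{\gamma} = {\rm constant}$, like those quoted above, can be extracted from posterior samples by an eigen-decomposition in log space. The sketch below is one way to do this and is not necessarily the procedure used for Fig. 10: it assumes the track is the major axis of the sample covariance of $(\ln x, \ln y)$, and the function name is hypothetical.

```python
import numpy as np


def degeneracy_index(x, y):
    """Return (gamma, const) such that y * x**gamma ~ const along the
    major axis of the (ln x, ln y) sample distribution."""
    ln_x, ln_y = np.log(x), np.log(y)
    cov = np.cov(np.vstack([ln_x, ln_y]))
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    major = eigvecs[:, -1]                  # direction of largest variance
    slope = major[1] / major[0]             # d ln y / d ln x along the track
    gamma = -slope
    const = float(np.exp(np.mean(ln_y) + gamma * np.mean(ln_x)))
    return float(gamma), const


if __name__ == "__main__":
    # Synthetic samples along y * x**0.5 = const, for illustration only.
    rng = np.random.default_rng(0)
    ln_x = rng.normal(np.log(0.3), 0.2, size=5000)
    ln_y = np.log(0.8) - 0.5 * (ln_x - np.log(0.3)) + rng.normal(0.0, 0.01, size=5000)
    gamma, const = degeneracy_index(np.exp(ln_x), np.exp(ln_y))
    print(f"gamma = {gamma:.3f}, const = {const:.3f}")
```

The returned constant is for the unnormalized form $y\,x^{\gamma}$; dividing $x$ by a pivot value, as in the expressions above, rescales the constant by the pivot raised to the power $\gamma$.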

The slope of the $\Omega_m$-$\beta$ correlation in the no prior case, however,
is intriguing ($-0.711$, as shown by the blue dotted curve in
the top left panel). Consider the experiment illustrated in the top
panels of Fig. 7, where we increase $\Omega_m$ while keeping the shape
of $P_{\rm lin}(k)$ and the $z = 0$ normalization $\sigma_8$ fixed. At $z = 0$, the
halo mass function and halo clustering of this shifted model are
nearly identical to those of the original model except for an overall
shift of the mass scale in proportion to $\Omega_m$ (see Zheng et al.
2002). Once the richness-mass relation is shifted by this same constant
factor, we expect a nearly identical cluster-mass correlation
function at fixed richness. We therefore expect $\Delta\Sigma(R) \propto \Omega_m \xi_{cm}$
to shift in proportion to $\Omega_m$, so there should be degeneracy of the
form $\beta\Omega_m^{-1} = {\rm constant}$, rather than $\beta\Omega_m^{-0.711}$. Our clusters have a
median redshift of $z_{\rm med} = 0.23$ rather than zero, and the amplitude
$\sigma_8(z_{\rm med})$ has a small dependence on $\Omega_m$ through the linear growth
factor. However, the departure from unit slope in this degeneracy
arises primarily because of the dependence of volume element on
$\Omega_m$. When $\Omega_m$ increases, the inferred volume of our cluster survey
(defined by nearly fixed redshift limits and area) decreases, so the
linear shift of the richness-mass relation that keeps the predicted
cluster abundance fixed in comoving $h^3\,{\rm Mpc}^{-3}$ in fact predicts a
lower number of MaxBCG clusters.
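
The volume argument above can be checked numerically. The sketch below computes the comoving volume of a redshift shell in a flat $\Lambda$CDM model and its logarithmic derivative with respect to $\Omega_m$. The redshift limits 0.1 to 0.3 are an assumption made for illustration (they are not stated in this section), the function names are hypothetical, and the result is per steradian in units of $h^{-3}\,{\rm Mpc}^3$.

```python
import math

import numpy as np

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc


def comoving_distance(z, omega_m, n_steps=2001):
    """Line-of-sight comoving distance in h^-1 Mpc for flat LCDM."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)
    dz = zz[1] - zz[0]
    # Trapezoidal rule on a uniform grid.
    integral = dz * (inv_e.sum() - 0.5 * (inv_e[0] + inv_e[-1]))
    return C_OVER_H0 * integral


def shell_volume(z_min, z_max, omega_m):
    """Comoving volume per steradian between z_min and z_max."""
    d_lo = comoving_distance(z_min, omega_m)
    d_hi = comoving_distance(z_max, omega_m)
    return (d_hi ** 3 - d_lo ** 3) / 3.0


def volume_log_slope(omega_m, z_min=0.1, z_max=0.3, frac=0.01):
    """Central-difference estimate of d ln V / d ln Omega_m."""
    v_hi = shell_volume(z_min, z_max, omega_m * (1.0 + frac))
    v_lo = shell_volume(z_min, z_max, omega_m * (1.0 - frac))
    return math.log(v_hi / v_lo) / math.log((1.0 + frac) / (1.0 - frac))


if __name__ == "__main__":
    print(f"d ln V / d ln Omega_m at 0.3: {volume_log_slope(0.3):.3f}")
```

A negative slope means that the same redshift limits and area enclose less comoving volume as $\Omega_m$ increases, which is the sense of the effect described in the text. This sketch only isolates the volume term and is not a derivation of the fitted index.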

Marginalizing over all other parameters, our constraints
on $\sigma_{\ln N_{200}|M}$ and $\beta$ are $\sigma_{\ln N_{200}|M} = 0.432^{+\ldots}_{-\ldots}$ and
$\beta = 1.004^{+\ldots}_{-\ldots}$, respectively, consistent with the results in
R10 ($\sigma_{\ln N_{200}|M} = 0.357 \pm 0.073$ and $\beta = 1.016 \pm 0.060$).
Applying WMAP7 priors tightens the constraints to $\sigma_{\ln N_{200}|M} =
0.436^{+\ldots}_{-\ldots}$ and $\beta = 0.968^{+\ldots}_{-\ldots}$. As we have already commented
in §5.3, there is mild tension between the scatter found here and in
R10; the difference is only $1\sigma$, but we are using the same cluster
abundance data.



6 SYSTEMATIC ERRORS 

We have adopted priors on several of our nuisance parameters, so 
there could be systematics beyond our quoted errors if these priors 
are too tight, or if our assumption of a power-law richness-mass 
relation is too restrictive. 

Fig. 11 compares the 68% confidence regions in the $\Omega_m$-$\sigma_8$
plane derived from our fiducial analysis (filled contours) to those
obtained for different priors or variations in the observational
analysis (open contours).

The bottom left panel explores the robustness of our analysis
against uncertainties in the scatter. When no prior on scatter is
applied, the low-$\Omega_m$ and high-$\sigma_8$ regions (red solid contour) are
accepted because of the degeneracy between $\sigma_8$ and $\sigma_{\ln N_{200}|M}$
discussed in §5.4. Our fiducial prior on the converse scatter, Gaussian
with width $\delta\sigma_{\ln N_{200}|M} = 0.10$, makes an important difference to
our individual errors on $\Omega_m$ and $\sigma_8$, though it has little impact on
the $\sigma_8\Omega_m^{0.501}$ error (the degeneracy banana gets longer, not wider).
If we double the prior width to $\delta\sigma_{\ln N_{200}|M} = 0.20$ (blue contour),
the cosmological constraints are close to the no prior case.
However, if we halve the prior width to $\delta\sigma_{\ln N_{200}|M} = 0.05$ (green
contour), there is only modest tightening of the constraints; at the
current level of statistical error in $\Delta\Sigma(R)$, better external
knowledge of $\sigma_{\ln N_{200}|M}$ would not make much improvement in our
results. Compared to Fig. 9, we see that the additional parameter
region allowed by a loose prior on scatter is ruled out if we combine
with the orthogonal WMAP7 constraint. This is the reason that the
posterior constraint on $\sigma_{\ln N_{200}|M}$ is much tighter when we include
WMAP7 (Table 4), even though WMAP does not probe clusters
directly.

The bottom right panel of Fig. 11 tests the robustness of our
Figure 11. Effects of relaxing priors and varying assumptions on observational systematics on the $\Omega_m$ and $\sigma_8$ constraint. Contours are 68% confidence
regions as constrained by our fiducial analysis (filled) or different modifications listed in each panel (open).



results against uncertainties in $\beta$. As discussed in §5.4, our
internal constraint on $\beta$ comes almost entirely from the volumetric
effect of $\Omega_m$, so we expect widening/narrowing of the Gaussian
prior on $\beta$ to have a much larger impact on $\Omega_m$ than on $\sigma_8$. The
blue dashed and green dotted contours show the results after we
double and halve the width of the prior on $\beta$, respectively. The blue
contour expands along the $\Omega_m$ axis with little change along the $\sigma_8$
axis, as anticipated. Halving the prior width produces only slight
improvement in the constraints; if our fiducial prior $\delta\beta = 0.06$
is accurate-to-conservative, as we think it is, then our constraints
are limited by the statistical errors of the weak lensing measurements
rather than the systematic uncertainties. If the prior on $\beta$ is
dropped completely, i.e., we allow arbitrary rescaling of the weak
lensing measurements and rely only on the relative $\Delta\Sigma(R)$
amplitudes between bins of different richness, then the contour expands
to fill nearly all possible $\Omega_m$ ranges while showing no degradation
of the constraint on $\sigma_8$.

The top left panel of Fig. 11 addresses the possible systematics
associated with our uncertainties in the $P(k)$ shape. The
red solid contour shows the constraints on $\Omega_m$ and $\sigma_8$ without
the $P(k)$ shape prior. As demonstrated in §4.3, this prior helps
eliminate the physically improbable regions of high-$\Omega_m$ and
low-$\sigma_8$, which are otherwise favored due to the degeneracy between the
$P(k)$ shape and the cosmological parameters. The blue dashed and
Figure 12. Effects of allowing curvature in the richness-mass relation on the $\Omega_m$ and $\sigma_8$ constraint (left) and the best-fit curved richness-mass relation (right).
Left: comparison of the 68% and 95% confidence regions derived from the fiducial (filled) and the curved model (open). Right: The best-fit power-law (thick
solid line) and piecewise spline interpolated (thick dashed curve) mean richness-mass relations. The gray band and the thin dashed curves indicate the scatter
about the mean relations. The two solid vertical lines indicate $M_1$ and $M_2$, while the dashed line indicates the additional tenor point $M_3$ for the model with
curvature.



green dotted contours represent the 68% confidence regions after
increasing and decreasing the width of the Gaussian prior on $\Delta n_s$
by a factor of two, respectively. Both contours barely differ from
the fiducial result, indicating our constraints on $\Omega_m$-$\sigma_8$ are robust
against the uncertainties in the tilt of the primordial power spectrum
within the range allowed by galaxy surveys.

In the top right panel of Fig. 11, we examine the robustness
of our constraint against the uncertainties in the completeness and
purity level of the sample and its sensitivity to the $\Delta\Sigma(R)$
measurements in the extreme richness clusters. As discussed in §3.2.2,
we allow for potential biases related to incompleteness or
contamination by adding ${\rm Var}(A) = 0.05^2$ to all elements of our
covariance matrix, diagonal and off-diagonal, based on estimates that the
MaxBCG catalog is at least 95% complete and pure in our richness
range. The red contour shows the effect of dropping this contribution
to the covariance matrix, which is negligible. We conclude that
uncertainties in completeness and contamination at the 5% level do
not affect our constraints.
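
Adding the same variance to every element of a covariance matrix models an error that is fully correlated across bins. The sketch below illustrates the idea under a stated assumption: the 5% term is treated as a fractional uncertainty on the number counts, so the added term is scaled by the outer product of the counts. If the covariance is already expressed in fractional units, passing a vector of ones as `counts` reduces this to adding $0.05^2$ to every element. The function name and the example counts are hypothetical.

```python
import numpy as np


def add_correlated_floor(cov, counts, frac=0.05):
    """Add a fully correlated fractional error `frac` to a covariance matrix.

    ASSUMPTION: `frac` is a fractional uncertainty on `counts`, so the added
    term is frac**2 * outer(counts, counts), on and off the diagonal.
    """
    cov = np.asarray(cov, dtype=float)
    counts = np.asarray(counts, dtype=float)
    return cov + frac ** 2 * np.outer(counts, counts)


if __name__ == "__main__":
    counts = np.array([100.0, 50.0])  # hypothetical cluster counts in two bins
    poisson_cov = np.diag(counts)     # Poisson-only covariance, for illustration
    print(add_correlated_floor(poisson_cov, counts))
```

Because the added term is a rank-one matrix, it lets all bins shift up or down together without changing their relative values.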

For the richness dependence of A, it is known that low richness
clusters are subject to a higher rate of contamination than rich
clusters, so we try our analysis excluding the number count datum for
the lowest richness bin at $N_{200} \in [11-15]$. The result is shown as
the blue dashed contour, which drifts up from the fiducial one
approximately along the degeneracy track. The constraints from the
blue dashed contour are $\sigma_8(\Omega_m/0.293)^{0.489} = 0.863 \pm 0.049$,
which slightly torques the halo mass function to better fit the cluster
richness function beyond the lowest bin. Since the abundance error
is smallest for our lowest richness bin, it carries significant weight
in the analysis, so it can have a noticeable impact even though it
is only one of nine abundance data points. However, the drift of
the contour is small compared to its size, so even contamination
at the 5% level in this bin would affect our result at a level small
compared to the statistical error.

As for the stacked $\Delta\Sigma(R)$ measurements, the largest
uncertainty occurs at the highest richness bin where the total number of
source galaxies is the least. The green dotted contour shows the
effect of removing the highest richness bin in the $\Delta\Sigma(R)$
measurements. The resulting confidence region elongates to accept some
high-$\sigma_8$ regions, because despite being noisy, the highest richness
bin carries more $\sigma_8$-sensitive information than other bins.

Although the average mapping between the true mass and optical
richness of clusters should be monotonic, a power-law may
be an over-simplification. To allow curvature in the mapping, we
add a third parameter in the mean richness-mass relation as the
mean log-richness $\ln N_3$ at $M_3 = 4.1 \times 10^{14}\,h^{-1}M_\odot$ (i.e., the
geometric mean of $M_1$ and $M_2$), then we spline interpolate through
the three tenor points on the log-mass vs. log-richness plane for
any given $\ln N_1$, $\ln N_2$, and $\ln N_3$, to find a smooth curve that
represents the new mean richness-mass relation. Fig. 12 shows the
results of adding curvature at $M_3$. Similar to the test where we
dropped the lowest richness bin in §6, the confidence regions slide
up along the degeneracy track for a gentle torque in the halo mass
function (blue open contours in the left panel), and the
richness-mass relation bends slightly (blue dashed curve in the right panel)
to reduce the number of the highest richness clusters. In this way,
the model achieves a better fit to the detailed shape of the
observed richness function than the fiducial power-law model.
However, the best-fit curved richness-mass relation remains very close
to a power law over the relevant mass range, and the parameter
constraints shift only slightly relative to the fiducial ones. The width of
the constraints grows by only a small amount, indicating that our
assumption of a power-law relation in our fiducial analysis does not
bias or overly restrict our results.
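
The curved model can be pictured with a small sketch of spline interpolation through three points in the log-mass vs. log-richness plane. The text does not specify the spline type, so the code below assumes a natural cubic spline (zero second derivative at both ends). The anchor masses `M1` and `M2` are hypothetical values chosen so that their geometric mean is close to the quoted $M_3 = 4.1 \times 10^{14}\,h^{-1}M_\odot$, and the power-law parameters are the best-fit values from §5.3.

```python
import math

A, ALPHA, M_PIVOT = 3.233, 0.739, 3.614e14  # fiducial power law (Section 5.3)


def natural_spline_3pt(xs, ys):
    """Natural cubic spline through three points; returns a callable."""
    x0, x1, x2 = xs
    y0, y1, y2 = ys
    h0, h1 = x1 - x0, x2 - x1
    # With zero second derivative at both ends, a single equation fixes the
    # second derivative at the middle knot.
    m_mid = 3.0 * ((y2 - y1) / h1 - (y1 - y0) / h0) / (h0 + h1)
    second = (0.0, m_mid, 0.0)

    def evaluate(x):
        i = 0 if x <= x1 else 1
        xa, xb = xs[i], xs[i + 1]
        ya, yb = ys[i], ys[i + 1]
        ma, mb = second[i], second[i + 1]
        h = xb - xa
        return ((ma * (xb - x) ** 3 + mb * (x - xa) ** 3) / (6.0 * h)
                + (ya / h - ma * h / 6.0) * (xb - x)
                + (yb / h - mb * h / 6.0) * (x - xa))

    return evaluate


# HYPOTHETICAL anchor masses in h^-1 Msun; only their geometric mean is
# tied to the text (M3 = 4.1e14).
M1, M2 = 1.3e14, 1.3e15
M3 = math.sqrt(M1 * M2)
LN_M = (math.log(M1), math.log(M3), math.log(M2))


def power_law_ln_n(ln_m):
    """Mean ln N200 from the fiducial power law of Equation 18."""
    return A + ALPHA * (ln_m - math.log(M_PIVOT))


if __name__ == "__main__":
    ys = [power_law_ln_n(x) for x in LN_M]
    ys[1] -= 0.05  # hypothetical small downward shift of ln N3
    curved = natural_spline_3pt(LN_M, tuple(ys))
    for mass in (2e14, 5e14, 1e15):
        x = math.log(mass)
        print(f"M = {mass:.1e}: power law {power_law_ln_n(x):.3f}, curved {curved(x):.3f}")
```

When the three anchor points lie on a straight line, the spline reduces exactly to the power law, so the fiducial model is a special case of the curved one.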
7 SUMMARY AND FUTURE PROSPECTS

We have derived cosmological constraints on $\Omega_m$ and $\sigma_8$ using the
combination of large scale cluster-galaxy weak lensing measurements
(S09) and the abundance of MaxBCG clusters as a function
of richness. Within the analysis, we have statistically calibrated the
cluster masses by requiring consistency between the cosmological
model fit and the data, exploiting external priors on the scatter in the
richness-mass relation from comparisons to X-ray data (Rozo et al.
2009), and on the $P(k)$ shape from galaxy clustering (Reid et al.
2010). The 68% confidence ellipse of our cosmological constraints
on the $\Omega_m$-$\sigma_8$ plane can be summarized as



$$\sigma_8(\Omega_m/0.325)^{0.501} = 0.828 \pm 0.049, \qquad (22)$$



which is consistent with and orthogonal to the WMAP7 constraints
on these parameters. This consistency of structure measured in the
recombination era and the low redshift universe provides further
evidence for the gravitational growth predicted by the $\Lambda$CDM model
combining GR, a cosmological constant, and cold dark matter.
Assuming this model to be correct and combining our analysis with
WMAP7, we obtain individual constraints as

$$\Omega_m = 0.298 \pm 0.020 \quad {\rm and} \quad \sigma_8 = 0.831 \pm 0.020. \qquad (23)$$



The overall results are consistent with and complementary to two
other cosmological constraints from the same underlying clusters
but with different input data and systematic uncertainties (R10
and Tinker12). Collectively these three studies demonstrate
consistency of the small scale weak lensing, large scale weak lensing,
galaxy content, and abundance of the MaxBCG sample, together
with galaxy clustering data.

The primary systematic uncertainties in our analysis are the
scatter in the cluster richness-mass relation and residual bias in
the weak lensing measurements associated with photometric
redshifts or shear calibration. However, with the external priors we
have adopted, neither of these systematics is a limiting factor in our
analysis; the uncertainties in our cosmological constraints are
dominated by statistical uncertainties in the large scale $\Delta\Sigma(R)$
measurements. These statistical errors can be sharply reduced in future
surveys with deeper imaging and better seeing. Statistical
improvements will require corresponding improvements in the control of
systematics. While we have focused in this paper on large scales
to complement the R10 analysis, the long term goal should be to
derive constraints from the full range of $\Delta\Sigma(R)$ simultaneously.
Achieving this goal will require theoretical and numerical work
to construct models that are accurate across the 1-halo and 2-halo
transition and to assess uncertainties in the accuracy of the model
predictions at all scales.

There are opportunities for significant near-term improvements
in our analysis using SDSS data. The MaxBCG catalog
and weak lensing measurements used here are based on imaging
data from DR4. With the recent release of DR8 (Aihara et al.
2011), almost every aspect of the catalog construction and the weak
lensing measurements has evolved. The increase in the imaging
area will enhance the raw statistical power ($7{,}398\,{\rm deg}^2$ vs.
$14{,}555\,{\rm deg}^2$), reducing Poisson uncertainties and sample variance
in the cluster counts and shape noise in the $\Delta\Sigma(R)$ measurements.
The optical cluster finding algorithm has been improved to produce
catalogs with a well-controlled selection function and, more
importantly, a new richness estimator with reduced intrinsic scatter.
Rykoff et al. (2012) considered various modifications of the original
richness estimator in MaxBCG and found that the scatter in
log-mass at fixed richness could be reduced to $0.2-0.3$ depending on
richness, substantially smaller than the MaxBCG scatter ($0.45 \pm 0.10$;
Rozo et al. 2009) that we adopted as a prior in our analysis. When
the scatter itself is smaller, then the systematic uncertainty tied to
uncertainty in the scatter is also smaller. With reduced uncertainty
in the selection function, it will be feasible to use higher redshift
clusters in the analysis, and while the $\Delta\Sigma(R)$ measurements will
degrade at higher $z$ because of reduced source surface density, the
leverage of a wider redshift range may strengthen the cosmological
constraints. On the weak lensing side, the main improvement
is a better understanding of the photometric redshift distribution
of source galaxies. With a much improved spectroscopic training
set and better photometric calibration, Sheldon et al. (2011)
reconstructed a redshift distribution for DR8 imaging data that is
primarily limited by sample variance. Additional improvements come
from updates in the photometric pipeline, including better sky
subtraction, more refined stellar masks, and better PSF corrections in
the shape measurements.
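
As a back-of-the-envelope illustration of the gain from the larger imaging area, the sketch below assumes that purely statistical errors (Poisson counts and shape noise) scale as the inverse square root of the survey area at fixed depth and source density. This scaling is an assumption of the sketch, not a forecast from the analysis, and it ignores the other DR8 improvements listed above.

```python
import math


def error_scaling(area_old, area_new):
    """Ratio of new to old statistical error, assuming error ~ area**-0.5."""
    return math.sqrt(area_old / area_new)


if __name__ == "__main__":
    ratio = error_scaling(7398.0, 14555.0)  # areas in deg^2, as quoted in the text
    print(f"statistical errors scale by a factor of {ratio:.2f}")
```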

Beyond SDSS, our approach can be applied to future,
deeper, large-area imaging surveys. In the near term, the
Pan-STARRS1 (PS1; Chambers 2007) $3\pi$ survey is expected to have
larger area than SDSS, slightly greater depth, and higher image
quality that yields a significant increase of the source density for
weak lensing. The Dark Energy Survey (DES; The Dark Energy Survey
Collaboration 2005), expected to start in late 2012, plans to survey
$5000\,{\rm deg}^2$ to a
depth two magnitudes beyond SDSS, with a weak lensing source
density a factor of ten higher. It is designed with cluster
cosmology and weak lensing as central goals, and our technique is
naturally adapted to it. In the longer term, the imaging data sets
from LSST, and the Euclid and WFIRST missions will allow radical
improvements in the precision of cluster-galaxy lensing analysis,
with effective source densities of $20-40\,{\rm arcmin}^{-2}$. These
imaging surveys can provide their own cluster catalogs identified
from the galaxy population, and they can provide stacked weak
lensing measurements for clusters identified via X-ray emission
or the SZ effect. Comparisons of results from different classes
of cluster catalogs allow powerful cross-checks for systematics
(see, e.g., Rozo et al. 2012a,b,c) and valuable constraints on
mass-observable relations (Cunha & Evrard 2010). The SZ effect
is a powerful technique for finding massive clusters at very
high redshifts (e.g., Reichardt et al. 2012; Williamson et al. 2011;
Marriage et al. 2011), and DES will target the area covered by the
SZ survey of the South Pole Telescope. X-ray observables may be
more tightly correlated with halo mass than optical observables,
and the upcoming eROSITA mission will carry out a sensitive
all-sky X-ray survey that will revolutionize cosmological studies with
X-ray selected clusters.

Oguri & Takada (2011) and Weinberg et al. (2012) argue that
cluster abundances with masses calibrated by stacked weak lensing
can provide constraints on structure growth that are highly
competitive with those from cosmic shear analysis of the same WL survey.
This conclusion assumes $\Delta\Sigma(R)$ measurements out to $\sim 1-2$
cluster virial radii, so the larger scale analysis illustrated here can
only strengthen the power of this approach. Cluster-galaxy lensing
is analogous to galaxy-galaxy lensing, but the relation between
clusters and halos is simpler than the relation between galaxies and
halos, and it is less subject to the complexities of baryonic physics.
This greater simplicity reduces systematic uncertainty associated
with theoretical modeling, which may ultimately compensate for
the rarity of clusters relative to galaxies (and consequent lower
statistical precision of the WL measurements). The Tinker12 study
shows that the constraining power of small scale measurements
can be enhanced by bringing in additional information from galaxy
clustering and cluster mass-to-number ratios. In future work, we
will investigate the generalization of this idea to large scales using 
the cluster-galaxy cross-correlation function, which can be mea- 
sured in projection from the same survey used for weak lensing 
analysis. Nature has provided observable signposts that mark the 
locations of the most massive halos in the universe, and stacked 
weak lensing provides a tool to measure the average mass profiles 
of these halos at high precision over a wide range of scales. Exploit- 
ing this combination promises to yield stringent tests of gravity on 
cosmological scales and of theories for the origin of cosmic accel- 
eration. 